[...] ambiguity is proposed. The model assumes an anchoring-and-adjustment
process in which an initial estimate provides the anchor, and adjustments
are made for what might be. The latter is modeled as the result of a
mental simulation process where the size of the simulation is a function
of the amount of ambiguity, and differential weighting of imagined
probabilities reflects one's attitude toward ambiguity. A two-parameter model
of this process is shown to be consistent with: Ellsberg's original
paradox, the non-additivity of complementary probabilities, current
psychological theories of risk, and Keynes' idea of the "weight of evidence."
The  model  is  tested  in  four  experiments  involving  both  individual  and 
group  analyses.  In  experiments  1  and  2,  the  model  is  shown  to  predict 
judgments  quite  well;  in  experiment  3,  the  inference  model  is  shown  to 
predict  choices  between  gambles;  experiment  4  shows  how  buying  and  selling 
prices  for  insurance  are  systematically  influenced  by  one's  attitude 
toward  ambiguity.  The  results  and  model  are  then  discussed  with  respect 
to:  (1)  the  importance  of  ambiguity  in  assessing  uncertainty;  (2)  the 
use  of  cognitive  strategies  in  judgments  under  ambiguity;  (3)  the  role 
of  ambiguity  in  risky  choice;  and,  (4)  extensions  of  the  model. 




AMBIGUITY AND UNCERTAINTY IN PROBABILISTIC INFERENCE


The literature on how people make judgments under uncertainty is large,
complex, and rife with controversy (see e.g., Edwards, 1954, 1968; Peterson &
Beach, 1967; Slovic & Lichtenstein, 1971; Rapoport & Wallsten, 1972; Slovic,
Lichtenstein, & Fischhoff, 1977; Einhorn & Hogarth, 1981; Kahneman, Slovic, &
Tversky, 1982; Cohen, 1982; Kyburg, 1983). One reason for the controversy is
that while there is agreement that "uncertainty" is a crucial factor in
inference, there is much less agreement about its meaning and measurement (cf.
Tversky & Kahneman, 1982). In particular, while most psychological work on
inference has been guided by a Bayesian or subjectivist view of probability,
increasing concerns have been expressed about this position (e.g., Cohen,
1977; Shafer, 1978). Central to the Bayesian view is the idea that
probability, which is a measure of one's degree of belief, can be operationalized
via choices amongst gambles (Savage, 1954). Thus, if two gambles have
identical payoffs but one is preferred to the other, it follows that the
probability of winning is greater for the chosen alternative.

The  subjectivist  view  of  probability  gains  much  of  its  force  by  making 
expressions  of  uncertainty  operational  via  choices  amongst  gambles.  However, 
whereas  probability  is  thereby  defined  precisely,  does  this  procedure  capture 
the  essential  psychological  aspects  of  uncertainty?  In  particular,  how  valid 
is  the  assumption  that  expressions  of  uncertainty  can  be  captured  through 
choices  amongst  gambles?  An  important  and  direct  attack  on  this  assumption 
was  put  forward  by  Daniel  Ellsberg  (1961)  and  we  examine  his  arguments  below. 
In doing so, however, we stress that our intent is to understand the
psychological bases of uncertainty rather than to critique the normative status of
the Bayesian position.


Ellsberg (1961) used the following example to show that the uncertainty
people experience has several aspects, one of which is not captured in the
usual betting paradigm: Imagine two urns, each containing red and black
balls. In urn 1, there are 100 balls but the proportions of red and black are
unknown; urn 2 contains 50 red and 50 black balls. Now consider a gamble such
that, if you bet on red and it is drawn from the urn you get $100; similarly
for black. However, if you bet on the wrong color, the payoff is $0. Imagine
having to decide which color to bet on if a ball is to be drawn from urn 1;
i.e., the choices are red (R1), black (B1), or indifference (I). What about
the same choices in urn 2: (R2), (B2), or (I)? Most people are indifferent in
both cases, suggesting that the subjective probability of red in urn 1 is the
same as the known proportion in urn 2, namely .5. However, would you be
indifferent to betting on red if urn 1 were to be used vs. betting on red
using urn 2 (R1 vs. R2)? Similarly, what about B1 vs. B2? Many people find
that they prefer R2 over R1 even though their indifference judgments within
both urns imply that p(R1) = p(R2) = .5. Furthermore, the same person who
prefers R2 over R1 may also prefer B2 over B1. This pattern of responses is
inconsistent with the idea that even a rank order of probabilities can be
inferred from choices. Thus, if R2 is preferred over R1, this implies
that p(R2) > p(R1). Moreover, since red and black are complementary events,
this means that p(B2) < p(B1). However, if B2 is preferred over B1, then
p(B2) > p(B1), which contradicts the preceding inequality. It is also
important to note that if p(R2) > p(R1) and p(B2) > p(B1), then either urn 2
has complementary probabilities summing to more than 1 (super-additivity), or
urn 1 has complementary probabilities summing to less than 1 (sub-additivity).
Although Ellsberg did not specifically discuss the non-additivity of
complementary probabilities (cf. Fellner, 1961), we shall show that it is intimately
related to the effects of different types of uncertainty on probabilistic
judgments.

From our perspective, the importance of Ellsberg's paradox lies in the
difference in the nature of the uncertainty between urns 1 and 2. In urn 1,
whereas one's best estimate of the proportion may be .5, confidence in that
estimate is low. In urn 2, on the other hand, one is at least certain about
the uncertainty in the urn. While it may seem strange, and even awkward, to
speak of uncertainty as being more or less certain itself, such a concept
captures an important aspect of how people make inferences from unknown, or
only partially known, generating processes. Indeed, the idea of uncertainty
about uncertainty has been considered from time to time under the rubrics
"second-order" uncertainty and probabilities for probabilities (e.g.,
Marschak, 1975). However, whereas this concept has received little support
amongst subjectivist statisticians (see e.g., de Finetti, 1977), its status as
a psychological factor of importance for understanding choice and inference
has been demonstrated experimentally (Becker & Brownson, 1964; Yates &
Zukowski, 1976). On the other hand, the process by which such second-order
uncertainty is used in inference, and the factors that affect its use, have not
been systematically studied. To be sure, Ellsberg suggested a number of
variables that should affect the "ambiguity" of a situation, including the amount,
type, reliability, and degree of conflict in the available information.
Indeed, he stated:

Ambiguity is a subjective variable, but it should be possible
to identify 'objectively' some situations likely to present
high ambiguity, by noting situations where available
information is scanty or obviously unreliable or highly conflicting;
or where expressed expectations of different individuals differ
widely; or where expressed confidence in estimates tends to be
low. Thus, as compared with the effects of familiar production
decisions or well-known random processes (like coin-flipping or
roulette), the results of Research and Development, or the
performance of a new President, or the tactics of an unfamiliar
opponent are all likely to appear ambiguous. (1961, pp. 660-661).

To  specify  the  concept  of  ambiguity  more  precisely,  reconsider  the  urn 
where  the  proportion  of  red  and  black  balls  is  unknown.  From  a  Bayesian 
perspective,  this  situation  can  be  thought  of  as  one  in  which  the  judge  has  a 
diffuse  prior  over  all  possible  values  of  the  proportion,  p(R).  However, 
imagine  that  one  sampled  four  balls  (without  replacement)  and  got  3  red  and  1 
black.  Note  that  this  result  rules  out  certain  values  of  p(R)  and  could 
change  one's  assessment  of  other  values  of  p(R).  Furthermore,  as  the  sample 
size increases, one should become more sure as to the actual value of p(R).
Therefore, as information increases, ignorance (a uniform distribution) gives
way to ambiguity (a non-uniform distribution over all outcomes), which then
reduces to a known p(R). However, while it is tempting to equate ambiguity
with some statistical measure of the dispersion of the subjective
distribution, this is unsatisfactory for the following reason: consider an urn that
contains  either  all  red  or  all  black  balls  but  you  don't  know  which.  In  such 
a  case  we  can  characterize  the  distribution  over  p(R)  as  having  half  its 
mass  at  zero  and  half  at  one.  Note  that  the  variance  or  range  of  this 
distribution  is  high,  yet,  ambiguity  is  low.  The  reason  is  that  such  a 
distribution  rules  out  all  values  of  p(R)  other  than  0  or  1  and  is  thus 
close  to  the  case  where  ambiguity  doesn't  exist  (as  in  urn  2).  Therefore,  in 
accord  with  its  dictionary  definition,  "having  two  or  more  possible  meanings,” 
ambiguity  is  a  function  of  the  number  of  alternative  parameter  values  that  are 
not  ruled  out  (or  made  implausible)  by  one's  knowledge  of  the  situation.  Note 
that  this  definition  is  similar  to,  but  not  identical  with,  statistical 
measures  such  as  variance,  range,  and  the  like. 
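
A short numerical sketch may help separate ambiguity from dispersion. The code below is our illustration, not part of the original analysis. It assumes an urn of 100 balls and a uniform prior over the number of red balls, and it treats a value of p(R) as "ruled out" when its probability falls below an arbitrary threshold `eps`. All function names are hypothetical.

```python
from math import comb

N_BALLS = 100  # urn size used in the text's example


def posterior_after_sample(red_seen, black_seen, n_balls=N_BALLS):
    """Posterior over p(R), starting from a uniform prior over the number of
    red balls, after sampling without replacement (hypergeometric likelihood)."""
    n = red_seen + black_seen
    likelihood = [
        comb(r, red_seen) * comb(n_balls - r, black_seen) / comb(n_balls, n)
        for r in range(n_balls + 1)
    ]
    total = sum(likelihood)
    return {r / n_balls: w / total for r, w in enumerate(likelihood)}


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(v * w for v, w in dist.items())
    return sum(w * (v - mean) ** 2 for v, w in dist.items())


def n_plausible(dist, eps=1e-12):
    """Number of parameter values not ruled out (probability above eps)."""
    return sum(1 for w in dist.values() if w > eps)


ignorance = posterior_after_sample(0, 0)     # no sample: uniform belief
after_sample = posterior_after_sample(3, 1)  # 3 red and 1 black observed
all_or_none = {0.0: 0.5, 1.0: 0.5}           # urn is all red or all black
known = {0.5: 1.0}                           # urn 2: proportion is known

for name, dist in [("ignorance", ignorance), ("after sample", after_sample),
                   ("all-or-none", all_or_none), ("known", known)]:
    print(f"{name:12s} variance={variance(dist):.4f} "
          f"plausible values={n_plausible(dist)}")
```

Under these assumptions the all-red-or-all-black urn has the largest variance of the four beliefs yet leaves only two values of p(R) in play, whereas the uniform belief has a smaller variance and leaves all 101 values open. The sample of 3 red and 1 black rules out the values of p(R) implied by fewer than 3 red balls or by no black balls.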

It is important to note that sample size is only one factor that
influences ambiguity since other information can affect the probability
distribution over the parameter of a stochastic process. Thus, imagine an urn
factory where employees color balls by throwing them at two adjacent cans of
black and red paint from a distance of 20 feet. Given our knowledge of this
process, it seems fair to expect that an urn of 100 balls would not contain
extreme proportions of red or black. A second example, due to Gärdenfors and
Sahlin (1982), is particularly illuminating on this issue:

...  consider  Miss  Julie  who  is  invited  to  bet  on  the  outcome 
of  three  different  tennis  matches.  As  regards  match  A,  she  is 
very  well-informed  about  the  two  players  ....  Miss  Julie 
predicts  that  it  will  be  a  very  even  match  and  a  mere  chance 
will determine the winner. In match B, she knows nothing
whatsoever about the relative strength of the contestants ... and
has no other information that is relevant for predicting the
winner of the match. Match C is similar to match B except that
Miss Julie has happened to hear that one of the contestants is
an excellent tennis player, although she does not know anything
about which player it is, and that the second player is indeed
an amateur so that everybody considers the outcome of the match
a foregone conclusion. (pp. 361-362).

Note that the amount and type of information in the three situations is quite
different, as is the amount of ambiguity (we would argue that match A has the
least ambiguity and match B the most). From our perspective, how does the
amount and type of ambiguity affect judgments of the probability of winning or
losing the match? Would Miss Julie, for example, judge that each player in
the three matches has a .5 chance of winning (or losing)?

Our  discussion  so  far  has  implied  that  ambiguity  is  generally  avoided 
since  it  adds  to  the  total  uncertainty  of  a  situation.  Indeed,  this  is 
explicitly mentioned by Ellsberg (1961, p. 666) in discussing why new
technologies will be resisted more than one would expect on the basis of their
first-order probabilities. However, this picture is not completely accurate,
as is made clear by another Ellsberg example (as quoted in Becker & Brownson,
1964, pp. 63-4, footnote 4): consider two urns with 1000 balls each. In urn
I, each ball is numbered from 1 to 1000 and the probability of drawing any
number is .001. In urn II, there are an unknown number of balls bearing any
single number. Thus, there may be 1000 balls with number 687, no balls with
this number, or anything in between. If there is a prize for drawing number
687 from the urn, would you prefer to draw from urn I or urn II? Note that
urn I has no ambiguity and each numbered ball has the same .001 chance of
being drawn. Urn II, on the other hand, can be characterized as inducing
extreme ambiguity (i.e., ignorance). However, for many people, the drawing
from urn II seems considerably more attractive than from urn I, thereby
implying  that  there  are  situations  in  which  ambiguity  is  preferred  rather  than 
avoided.  This  is  considered  in  detail  later,  but  we  note  here  that  accounting 
for  such  shifts  is  an  important  criterion  for  judging  the  adequacy  of  any 
theory  of  inference  under  ambiguity. 

Finally, the concepts of ambiguity, second-order uncertainty, and the
like, have been of concern in theories of inference quite apart from their
role in affecting choice. For example, work on fuzzy sets (Zadeh, 1978),
Shafer's theory of evidence (1976), Cohen's (1977) attempt to formalize
uncertainty in legal settings, and the elicitation of probability ranges
(Wallsten, Forsyth, & Budescu, 1983), all contain ideas concerning the
vagueness that can underlie probabilities. Indeed, statisticians have provided
axiomatic systems for trying to formalize probability ranges and rank orders
rather than specific values (e.g., Koopman, 1940). Moreover, early work by
Keynes (1921) also addressed the notion of ambiguity by distinguishing between
probability and what he called the "weight of evidence." Keynes stated:

The magnitude of the probability ... depends upon a balance
between what may be termed the favourable and the unfavourable
evidence; a new piece of evidence which leaves this balance
unchanged, also leaves the probability of the argument unchanged.
But it seems that there may be another respect in which some kind
of quantitative comparison between arguments is possible. This
comparison turns upon a balance, not between the favourable and
unfavourable evidence, but between the absolute amounts of
relevant knowledge and of relevant ignorance respectively.
(Keynes, 1921, p. 71, original emphasis).

Plan  of  the  Paper 

We first present a descriptive model of how people make probability
judgments and choices under varying amounts of ambiguity. We require that our
model be able to: (1) explain the pattern of choices elicited by Ellsberg's
problems. This, in turn, implies that the model account for sub- and
super-additivity of the probabilities of complementary events; (2) specify the
conditions under which people will seek as well as avoid ambiguity; (3) allow
for individual differences; and (4) be empirically testable. To meet these
criteria, the model is tested in four experiments at both the aggregate and
individual subject levels. Two of the experiments concern inference,
and two involve choices. The implications of the theory and empirical work
are then discussed in relation to: (a) the importance of ambiguity in
assessing perceived uncertainty; (b) the use of cognitive strategies in
understanding probabilistic judgments under ambiguity; (c) the role of
ambiguity in risky choice; and (d) extensions of the model to multiple sources
and time periods.

A  Descriptive  Model 

Our model postulates an "anchoring and adjustment" strategy for assessing
probabilities. This involves an initial estimate, denoted $p_A$, and an
adjustment to reflect the ambiguity in the situation. Thus, the ensuing
judgment, $S(p_A)$, is given by,

$$S(p_A) = p_A + k \tag{1}$$

where $k$ is the net effect of the adjustment process. To model the
adjustment process, we propose that people engage in a mental simulation in which
other values of $p$ are considered by imagining how well they express one's
uncertainty. These simulated values are then incorporated into the adjustment
term. The rationale for the simulation is that in ambiguous situations, $p$
can be any one of a number of values. By incorporating the range of possible
values of $p$ into their judgments, people can maintain sensitivity to both
uncertainty and ambiguity.

We further argue that $k$, the net effect of the simulation, will be
affected by three factors: (1) The level of $p_A$; that is, since $S(p_A)$
varies between 0 and 1, equation (1) implies that $-p_A \le k \le (1 - p_A)$. This
means that the direction of the adjustment must be due, in part, to the value
of $p_A$. Indeed, when $p_A = 0$, $k \ge 0$, and the adjustment (if there is one)
must be upward; when $p_A = 1$, $k \le 0$, so that the adjustment must be downward;
when $p_A \ne 0, 1$, adjustments can be up or down; (2) the amount of ambiguity
perceived in the situation. We denote this by a parameter $\theta$, which
determines the absolute size of the adjustment, i.e., the larger $\theta$, the
larger the adjustment; (3) the person's attitude toward ambiguity in the
circumstances. This is reflected in the tendency to give differential
attention or weight to values of $p$ that are greater or smaller than the
initial estimate, $p_A$. One's attitude toward ambiguity is denoted by $\beta$, and
this parameter, together with $p_A$, determines the sign of the net effect of
the adjustment, i.e., when $k$ is positive or negative.

To model the adjustment process algebraically, let

$$k = k_g - k_s \tag{2}$$

where $k_g$ denotes the effect of imagining values of $p$ greater than the
initial estimate, and $k_s$ the effect of imagining smaller values. How does
perceived ambiguity affect these quantities? To answer this, consider Figure
1, which shows the position of $p_A$ relative to the end points of 0 and 1.

Note that the maximum upward adjustment to $p_A$ is $(1 - p_A)$ and the maximum
downward adjustment is $p_A$. If we restrict the range of $\theta$ to the unit
interval ($0 \le \theta \le 1$), then maximum adjustments would occur under complete
ambiguity ($\theta = 1$), and zero adjustments under no ambiguity ($\theta = 0$). This
suggests that the effects of simulating values greater and smaller than $p_A$
($k_g$ and $k_s$, respectively), can be represented as proportions of the maximum
adjustments where $\theta$ is the constant of proportionality, i.e.,

$$k_g = \theta (1 - p_A) \tag{3a}$$

and

$$k_s = \theta p_A \tag{3b}$$

The development so far ignores the possibility that greater and smaller
values than $p_A$ could be differentially weighted. For example, in estimating
the chance of an accident in a new technology (high ambiguity), one may start
with the estimate offered by the engineering department and then weight larger
values of p(accident) more than smaller ones. To account for differential
weighting effects, we need only weight either $k_g$ or $k_s$ to affect $k$. For
convenience, we weight $k_s$ (rather than $k_g$) by $\beta$ as follows,

$$k_s = \theta p_A^{\beta} \qquad (\beta \ge 0) \tag{4}$$

Thus, the net effect of the adjustment process is given by,

$$k = \theta (1 - p_A - p_A^{\beta}) \tag{5}$$

When (5) is substituted into (1), the full model becomes,

$$S(p_A) = p_A + \theta (1 - p_A - p_A^{\beta}) \tag{6}$$


We make four points with respect to equation (6). First, note that
$\theta$ affects the absolute size of the adjustment factor. That is, when there is
no ambiguity, $\theta = 0$ and $S(p_A) = p_A$. Thus, $\theta$ can be thought of as having
a magnifying or dampening effect on one's attitude toward ambiguity in the
circumstances ($\beta$). For example, if perceived ambiguity is small, the
tendency to weight differentially values of $p$ above and below $p_A$ is of
little consequence.
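
Equation (6) can be written as a small function, which makes the first point easy to check numerically. This is a minimal sketch: the function name and the parameter values in the usage lines are illustrative assumptions, not estimates from the experiments.

```python
def judged_probability(p_a, theta, beta):
    """Equation (6): S(p_A) = p_A + theta * (1 - p_A - p_A**beta).

    p_a   : initial (anchor) estimate, between 0 and 1
    theta : perceived ambiguity, between 0 and 1
    beta  : attitude toward ambiguity, non-negative
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must lie in [0, 1]")
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if beta < 0.0:
        raise ValueError("beta must be non-negative")
    return p_a + theta * (1.0 - p_a - p_a ** beta)


# Hypothetical values chosen only to illustrate the role of theta.
no_ambiguity = judged_probability(0.5, theta=0.0, beta=0.5)
small_theta = judged_probability(0.5, theta=0.05, beta=0.5)
large_theta = judged_probability(0.5, theta=0.50, beta=0.5)

print(no_ambiguity, small_theta, large_theta)
```

With theta = 0 the anchor is returned unchanged. For a fixed beta, the adjustment away from the anchor grows with theta, which is the magnifying or dampening role described above.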

Second, $S(p)$ is regressive with respect to $p$. This can be illustrated
by considering the effects of different values of $\beta$ in equation (6).
Specifically, the three panels of Figure 2 illustrate "ambiguity functions"
with $\beta < 1$, $\beta > 1$, and $\beta = 1$. ($\theta$ is shown to be the same in all three
cases.)

Insert Figure 2 about here

It is important to note that each value of $\beta$ defines a unique
"cross-over" point, $p_c$, where $S(p) = p$. Thus, in Figure 2a, $\beta$ defines
$p_{c1}$ such that small probabilities are overweighted and larger probabilities
underweighted. This form of the function results because $\beta < 1$ implies that
more weight is given to smaller values of $p$ rather than larger ones.
Therefore, $k < 0$ over most of the range of $p$. However, when $p_A < p_{c1}$, there
are few smaller values of $p$ to consider relative to larger ones. Thus, even
when smaller values are weighted more heavily than larger ones, there are more
of the latter and $k > 0$. Conversely, when $\beta > 1$, as shown in Figure 2b,
$S(p) > p$ over most of the range of $p$ since more weight is given to larger
as opposed to smaller $p$'s. However, when $p_A > p_{c2}$, $S(p) < p$ since there
are few larger values and $k < 0$. Finally, note that in Figure 2c, when $\beta
= 1$, the cross-over point is at .5.
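
The cross-over point can be located numerically. Setting S(p) = p in equation (6) leaves theta * (1 - p - p**beta) = 0, so for any theta > 0 the cross-over depends on beta alone. The sketch below solves 1 - p - p**beta = 0 by bisection; the function name and the tolerance are our assumptions.

```python
def crossover_point(beta, tol=1e-12):
    """Return p_c in (0, 1) such that 1 - p_c - p_c**beta = 0.

    For beta > 0 the left-hand side falls monotonically from 1 (at p = 0)
    to -1 (at p = 1), so bisection finds the single root.
    """
    if beta <= 0.0:
        raise ValueError("beta must be positive for an interior cross-over")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if 1.0 - mid - mid ** beta > 0.0:
            lo = mid  # still below the cross-over: S(p) > p here
        else:
            hi = mid  # at or above the cross-over
    return (lo + hi) / 2.0


for beta in (0.5, 1.0, 2.0):  # illustrative values only
    print(f"beta={beta}: cross-over at p_c={crossover_point(beta):.4f}")
```

With these illustrative values the cross-over lies below .5 when beta < 1, at .5 when beta = 1, and above .5 when beta > 1, in line with the three panels described above.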

Third, equation (6) implies the conditions under which probability
judgments of complementary events are additive (sum to one). Specifically,
consider the sum of $S(p_A)$ and $S(1 - p_A)$. This is,

$$
\begin{aligned}
S(p_A) + S(1 - p_A) &= p_A + \theta (1 - p_A - p_A^{\beta}) + (1 - p_A) + \theta [1 - (1 - p_A) - (1 - p_A)^{\beta}] \\
&= 1 + \theta [1 - p_A^{\beta} - (1 - p_A)^{\beta}]
\end{aligned}
\tag{7}
$$

Thus, complementary probabilities are additive if: $\theta = 0$, or $\beta = 1$, or $p_A
= 0, 1$; otherwise, there is sub-additivity if $\beta < 1$, and super-additivity if
$\beta > 1$.
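
The additivity conditions can be checked numerically. The sketch below evaluates the sum of the two judgments directly from equation (6) and compares it with the closed form in equation (7). Function names and the parameter grid are hypothetical choices made for illustration.

```python
def judged_probability(p_a, theta, beta):
    """Equation (6): S(p_A) = p_A + theta * (1 - p_A - p_A**beta)."""
    return p_a + theta * (1.0 - p_a - p_a ** beta)


def complementary_sum(p_a, theta, beta):
    """S(p_A) + S(1 - p_A), computed directly from equation (6)."""
    return (judged_probability(p_a, theta, beta)
            + judged_probability(1.0 - p_a, theta, beta))


def closed_form_sum(p_a, theta, beta):
    """Right-hand side of equation (7)."""
    return 1.0 + theta * (1.0 - p_a ** beta - (1.0 - p_a) ** beta)


theta = 0.3  # illustrative amount of ambiguity
for beta in (0.5, 1.0, 2.0):
    sums = [complementary_sum(p, theta, beta) for p in (0.1, 0.3, 0.5, 0.9)]
    print(f"beta={beta}: " + ", ".join(f"{s:.3f}" for s in sums))
```

For interior values of the anchor and theta > 0, the printed sums fall below 1 when beta < 1, equal 1 when beta = 1, and exceed 1 when beta > 1.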

Fourth, there are many ways we could have chosen to incorporate the $\beta$
parameter  in  the  model.  However,  not  all  forms  have  the  same  implications, 
particularly  with  respect  to  the  additivity  of  complementary  probabilities. 

We  consider  several  alternative  models  in  Appendix  A. 

To summarize, the model has two parameters and both are functions of
individual and situational factors. The $\theta$ parameter reflects perceived
ambiguity and the degree to which one simulates values of $p$ that "might
be." However, situational factors are also likely to affect $\theta$ (across
people); e.g., the absolute amount of evidence available, the unreliability of
sources, lack of causal knowledge regarding the process generating outcomes,
and so on. The $\beta$ parameter reflects the extent to which one differentially
weights in imagination possible values of $p$ that are smaller vs. larger
than $p_A$. As such, $\beta$ may be related to an optimism-pessimism attitude at
the individual level. However, we argue that $\beta$ will also be influenced by
situational variables such as the sign and size of the payoffs that are
contingent on the ambiguous probability. For example, if the general effect
of ambiguity is to induce caution rather than riskiness, the prospect of an
undesirable outcome (e.g., monetary losses) would induce people to pay more
attention in imagination to values of p(Loss) that are larger than $p_A$;
similarly, the prospect of a gain would focus attention on smaller values of
p(Gain). We consider this issue further in connection with experiments 3 and 4.

We now consider how the model in (6) explains Ellsberg's original results.
Note Figure 2a, where $\theta > 0$ and $\beta < 1$. A person with parameter values in
these ranges will "underweight" all $p_A$ above $p_c$, and "overweight" $p_A <
p_c$. This particular pattern explains why most people in Ellsberg's urn
example avoid the ambiguous urn 1; that is, $S(p_A = .5) < .5$. However, note
that if $p_A$ is less than $p_c$, $S(p_A) > p_A$ and one would expect the same
person who avoided the ambiguous urn when $p_A = .5$, to prefer the ambiguous
urn when $p_A$ is sufficiently low (e.g., when $p_A = .001$). The pattern of
overweighting small $p_A$ and underweighting moderate-to-large $p_A$ also
accounts for some otherwise puzzling results of Goldsmith and Sahlin (as
reported in Gärdenfors & Sahlin, 1982). They presented subjects with
descriptions of either well-known events (e.g., drawing cards from a standard
deck), or events about which the subjects had little knowledge (e.g., the
likelihood of a bus strike in Verona, Italy next week). Subjects estimated
the probabilities of the events and the perceived reliability of their
probability estimates. Events with equal probabilities but unequal
reliabilities were then used in a lottery set-up. The authors report that,

... for probabilities other than fairly low ones, lottery
tickets involving more reliable probability estimates tend to be
preferred. (Gärdenfors & Sahlin, 1982, p. 363, our emphasis.)

While the pattern shown in Figure 2a accounts for much data, it does not
explain why some people in the Ellsberg task prefer to bet on drawing from the
ambiguous urn when $p_A = .5$. However, consider a person with an $S(p_A)$
function as shown in Figure 2b. When $\theta > 0$ and $\beta > 1$, one gets "ambiguity
preference" over most of the range of $p_A$. Thus, when $p_A < p_c$, $S(p_A) > p_A$
and overweighting occurs; when $p_A > p_c$, $S(p_A) < p_A$ and underweighting
occurs. Since individual differences are rarely accounted for in research on
decisions under uncertainty, our model has the distinct advantage of positing
a general psychological process while allowing for individual differences via
particular parameter values. Indeed, this is nicely illustrated by
considering people who are indifferent between gambles from ambiguous and unambiguous
urns when $p_A = .5$ (as in the Ellsberg case). Our model suggests two
distinct types: those for whom $\theta = 0$, and thus, $S(p_A) = p_A$; and those
for whom $\theta > 0$ and $\beta = 1$ (shown in Figure 2c). This latter group does not
adjust at $p_A = .5$, but does adjust at all other values. Therefore, people
characterized by these parameter values will only be indifferent between
lotteries at .5.
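
The response patterns discussed above can be reproduced from equation (6) with illustrative parameter values. In the sketch below the labels and the specific values of theta and beta are hypothetical and chosen only to fall in the ranges named in the text; they are not estimates from any subject.

```python
def judged_probability(p_a, theta, beta):
    """Equation (6): S(p_A) = p_A + theta * (1 - p_A - p_A**beta)."""
    return p_a + theta * (1.0 - p_a - p_a ** beta)


# (theta, beta) pairs; all values are assumptions made for illustration.
profiles = {
    "theta > 0, beta < 1": (0.2, 0.5),
    "theta > 0, beta > 1": (0.2, 2.0),
    "theta > 0, beta = 1": (0.2, 1.0),
    "theta = 0": (0.0, 0.5),
}

results = {}
for label, (theta, beta) in profiles.items():
    # Anchors of .5 (two-color urn) and .001 (thousand-number urn).
    s_half = judged_probability(0.5, theta, beta)
    s_low = judged_probability(0.001, theta, beta)
    results[label] = (s_half, s_low)
    print(f"{label:22s} S(.5)={s_half:.4f}  S(.001)={s_low:.4f}")
```

Under these assumed values, the first profile judges the ambiguous urn below .5 at an anchor of .5 but well above .001 at an anchor of .001, which is the shift from ambiguity avoidance to ambiguity preference described in the text. The last two profiles both return .5 at an anchor of .5, but only the profile with theta = 0 leaves the low anchor unchanged.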

Finally, we note that our model is relevant to a major psychological
theory of risk; namely, "prospect theory" (Kahneman & Tversky, 1979). From
our perspective, the treatment of uncertainty in prospect theory is consistent
with our approach since a decision-weight function is posited that is
remarkably similar to the $S(p_A)$ function shown in Figure 2a. This is not a
coincidence since, as Kahneman and Tversky specifically point out, decision
weights can be affected by ambiguity. Indeed, they state,

The  decision  weight  associated  with  an  event  will  depend 
primarily  on  the  perceived  likelihood  of  that  event,  which  could 
be  subject  to  major  biases.  In  addition,  decision  weights  may  be 
affected  by  other  considerations,  such  as  ambiguity  or 
vagueness.  Indeed,  the  work  of  Ellsberg  and  Fellner  implies  that 
vagueness reduces decision weights. (p. 289)

While our equation (6) could be made fully compatible with the decision-weight
function of prospect theory (by restricting its applicability to $0 < p < 1$
and thereby not defining the end points),¹ we wish to emphasize that (6)
expresses a class of functions. Therefore, while the decision-weight function
of prospect theory expresses a general tendency to treat uncertainty in a
particular way, (6) allows for both situational variables and individual
differences in the handling of uncertainty.


EXPERIMENTAL  TESTS  OF  THE  MODEL 

To  test  our  model  empirically,  we  employed  two  tasks  that  focused  on 
inference  (experiments  1-2)  and  two  dealing  with  choice  (experiments  3-4).  In 
the  inference  task,  people  were  asked  to  make  probability  judgments  on  the 
basis of numbers of reports from a source. In experiment 1, we examined the
various  implications  of  equation  (6).  In  experiment  2,  we  used  different 
scenarios to manipulate $\beta$ in both a between- and within-subjects design. In
addition,  the  consistency  of  individual  differences  in  strategy  (as  measured 
by a person's $\theta$ and $\beta$ parameters) was also considered. Experiment 3
involved an attempt to answer the question: Can an individual's choices
between gambles be predicted from knowledge of his or her $\theta$ and $\beta$
parameters obtained from a separate inference task? Finally, in experiment 4,
people were asked to be either buyers or sellers of insurance in ambiguous and
non-ambiguous situations. Differences between buying and selling prices were
then investigated as a function of assumed differences in $\beta$ parameters.

Since  experiments  1-3  are  all  based  on  the  same  type  of  inference  task,  we 
first  explicate  the  underlying  nature  of  this  task,  noting  how  it  differs  from 
other  probabilistic  tasks  considered  in  the  literature. 

A  Model  for  Studying  Ambiguity  in  Inference 

The prototypical inference that we consider involves a judge assessing
the likelihood of the occurrence of an event based on reports received from
a source of limited reliability. The task can be thought of as having the
elements schematically represented in Figure 3. (1) An event occurs;
(2) The event is "sensed" by observers (e.g., witnesses to an accident) who,
in principle, can be characterized by levels of sensitivity and bias. However,
it is important to emphasize that these levels are unknown to the judge
(see 5 below); (3) The observers report what they saw. We denote A* as the
report of event A, and B* as the report of event B, where the decision rule
is to report A* if the observation is above some critical value X_c, and
B* otherwise. The reports can therefore be conceptualized as coming from a
signal-detection task; (4) Since there are n observers, n reports are
collected. Thus, the n reports can be thought of as the outcomes of n
observers reporting on a single trial of a signal-detection task. Furthermore,
since we do not differentiate between the n observers, we refer to
them as coming from a single source; (5) The judge receives the information in
the form of f reports for a hypothesis (i.e., f reports of A*) and c
reports of an alternative (i.e., c reports of B*), where f + c = n and p =
f/n. The content of the scenario, however, is assumed to give the judge some
information as to what values of p to expect in a sample of size n.
Specifically, we argue that expectations concerning p will be influenced by
(a) the dissimilarity between events A and B; and (b) the credibility of the
source. By "credibility" we mean the sensitivity and response bias of the
observers in judging the particular events of interest. For example, imagine
that you are a detective investigating a bank robbery where two witnesses
claim that the robber has blond hair and one witness claims it is brown. How
likely is it that the robber has blond hair? While the detective knows neither
the hit and false-alarm rates of the witnesses, nor their response bias for
saying "blond" vs. "brown," he may know something about the quality of
eye-witnesses in a robbery, the confusability of blond and brown in the
circumstances, and perhaps something about the motivation of the witnesses.
Now contrast this with a situation where the source is two color television
cameras that were filming the robbery at the bank. Whereas in the former case
the detective would expect the reports to conflict (i.e., 0 < p < 1), in the
latter it would be surprising if p were not equal to either 0 or 1.

Figure 3. Structure of the inference task

Note that in Figure 3, we have represented the judge's expectations by
three different distributions. In distribution (1), the information about the
credibility of the source, the dissimilarity of the signals, and the size of
the sample does not rule out many values of p. This is a highly ambiguous
situation and would, for example, characterize the detective trying to judge
evidence from witnesses. Distribution (2) characterizes expectations based on
a highly credible source that discriminates between dissimilar signals; e.g.,
evidence from cameras filming the robbery. We believe that ambiguity is low
here since our knowledge of the process that generates evidence rules out most
values of p. Distribution (3) also represents a situation of low ambiguity,
but it is quite different from (2). Indeed, (3) is likely to result when the
credibility of the source is particularly low and/or the signals are very
similar, in direct opposition to the conditions that produce (2). For
example, imagine a taste test between Pepsi and Coke for randomly chosen
shoppers. If we believe that the two drinks have a very similar taste and
that most shoppers are not able to tell the difference, we would expect the
proportion of reports for either product to be around .5. Thus, results from
such a test might be seen as most closely resembling the drawing of balls from
an urn with known p = .5. It is interesting to note that whereas some
authors have equated increased reliability of evidence with less ambiguity (as
suggested by Ellsberg, for example), distribution (3) shows that decreased
reliability can also lead to low ambiguity. Another way to express this is to
note that high reliability implies low ambiguity (distribution (2)), but low
ambiguity does not imply high reliability (since distribution (3) could be
involved); (6) The judge combines the information from the reports with
expectations about p to reach an assessment of the likelihood of A.
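The reporting stage of this task (elements 3 to 5) can be illustrated with a small simulation. The sketch below is not part of the original study: it assumes an equal-variance Gaussian signal-detection setup, and the names `simulate_reports`, `d_prime`, and `criterion` are hypothetical. It only shows how a critical value turns n observations into f reports of A*, c reports of B*, and p = f/n.

```python
import random


def simulate_reports(event_is_a, n, d_prime, criterion, rng):
    """Simulate n observers reporting on a single trial.

    Assumption (not from the source): each observation is Gaussian with
    unit variance, centered at d_prime when A occurred and at 0 when B
    occurred. An observer reports A* when the observation exceeds the
    critical value, and B* otherwise.
    """
    mean = d_prime if event_is_a else 0.0
    f = sum(1 for _ in range(n) if rng.gauss(mean, 1.0) > criterion)
    c = n - f
    return f, c, f / n


rng = random.Random(0)
# Source of limited sensitivity: reports may conflict (0 < p < 1).
print(simulate_reports(True, 9, d_prime=1.0, criterion=0.5, rng=rng))
# Highly sensitive source with dissimilar signals: p should be 0 or 1.
print(simulate_reports(True, 9, d_prime=10.0, criterion=5.0, rng=rng))
```

In the task described here the judge sees only f and c; the sensitivity and criterion that generated them remain unknown, which is what makes the judgment ambiguous.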

The structure of this task is both similar to and different from several
probabilistic models of the inference process. First, it is similar to
cascaded inference in that the judge is making inferences about an event
on the basis of unreliable reports (cf. Schum & Kelley, 1973; Schum, 1980).
However, in contrast to studies of cascaded inference, the judge does not know
the precise value of the source's reliability; rather, there is ambiguity
concerning what this is.

Second, since each observer can be thought of as participating in the
same signal-detection task, the reports not only reflect their sensitivity to
competing signals, but also their bias due to differential payoffs. However,
as recently emphasized by Birnbaum (1983), the manner in which the judge
treats the observer reports depends on some theory about the observers. For
example, the observer reports could be responsive to the prior probabilities
of A and B as well as to differential payoffs. We emphasize that in our task
the judge is not given precise information about these matters. Furthermore,
since the judge only receives information on a single trial, the observers'
hit rate and false-alarm rate are not known. Instead, the observed p, and
the judge's expectations about p, become cues to the likelihood that the
event occurred.

Third, one might consider our situation as a conventional Bayesian
revision task (cf. Edwards, 1968). However, the explicit probabilities
necessary to assess the likelihood functions are not provided, and no
base-rate data or prior probabilities are stated. It would, of course, be
possible to provide the judge with explicit prior probabilities. This would,
however, be extending our paradigm to one where multiple sources of
information need to be combined, i.e., base rates and individuating
information. For the sake of simplicity, we only consider the effects of
ambiguity on inferences from a single source and thus do not discuss the
effects of explicit base rates (extensions of our model to multiple sources
are considered in the Discussion section).

In this model, note that we have explicitly recognized three sources of
ambiguity, viz: (a) the dissimilarity between events A and B; (b) the
credibility of the source; and (c) the number of reports, or sample size, n.
Specifically, when n is small, one would expect ambiguity to be high;
however, as n increases, we would expect ambiguity to decrease. Thus, to
incorporate the effects of n explicitly in our model, let θ = θ'/n such
that,

$$S(f:c) = p_A + \frac{\theta'}{n}\left(1 - p_A - p_A^{\beta}\right) \qquad (8)$$

where S(f:c) = judged probability, and p_A = f/n. That is, in judging the
probability of an event based on f reports "for" and c "con," people are
assumed to anchor on f/n, and then adjust for the unreliability of the source
and the amount of data. The model in equation (8) has several implications:

(1) Consider the effect of the amount of information (n) on judged
likelihood. Note that S → p_A as n → ∞. This means that as the amount of
information increases, one becomes more certain as to the diagnosticity of the
data. It is important to realize that as n → ∞, S does not go to 0 or 1 as
would be implied by a standard Bayesian revision model. Instead, the fact
that S asymptotes at p_A parallels an analogous result in cascaded
inference where, under certain symmetry assumptions, the maximum probability
of a hypothesis is bounded by the reliability of the reporting source (Schum &
DuCharme, 1971).


(2) Conditional on a given value of θ', the model implies that there
will be trade-offs between p and n in determining judged likelihood. For
example, one might find the evidence in favor of some hypothesis to be more
convincing on the basis of (9:1) than (2:0). However, because S asymptotes
at p_A, trade-offs of p and n will only occur at small values of n.

(3) Since θ = θ'/n, n also affects the conditions underlying the
additivity of complementary probabilities. Specifically,

$$S(f:c) + S(c:f) = 1 + \frac{\theta'}{n}\left[1 - p_A^{\beta} - (1 - p_A)^{\beta}\right] \qquad (9)$$

Thus, in addition to the additivity conditions discussed in regard to equation
(7), as n → ∞, additivity will hold regardless of θ', β, or p_A. Of
course, when n is small (meager data), adjustments will be substantial and
violations of additivity will be most likely.
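The step from equation (8) to equation (9) can be written out explicitly. The derivation below is a sketch that assumes equation (8) has the form S(f:c) = p_A + (θ'/n)(1 - p_A - p_A^β) and that the complementary judgment S(c:f) anchors on 1 - p_A with the same θ' and β:

```latex
\begin{align*}
S(f:c) &= p_A + \frac{\theta'}{n}\left(1 - p_A - p_A^{\beta}\right), \\
S(c:f) &= (1 - p_A) + \frac{\theta'}{n}\left(p_A - (1 - p_A)^{\beta}\right), \\
S(f:c) + S(c:f) &= 1 + \frac{\theta'}{n}\left[1 - p_A^{\beta} - (1 - p_A)^{\beta}\right].
\end{align*}
```

For 0 < p_A < 1 and β < 1, p_A^β + (1 - p_A)^β exceeds 1, so the bracketed term is negative. With θ' > 0 the sum then falls below 1 by an amount proportional to 1/n.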

Experiment 1 explicitly considers the role of n in equation (8),
whereas factors affecting θ' are the central concern of experiment 2.
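The three implications above can be checked numerically. The following is a minimal sketch, assuming equation (8) has the form S(f:c) = p_A + (θ'/n)(1 - p_A - p_A^β). The function name `judged_probability` is hypothetical, and the parameter values are the aggregate estimates reported for experiment 1, used here only for illustration.

```python
def judged_probability(f, c, theta_prime, beta):
    """Judged probability S(f:c): anchor on p_A = f/n, then adjust.

    Assumed form: S = p_A + (theta'/n) * (1 - p_A - p_A**beta).
    """
    n = f + c
    p_a = f / n
    return p_a + (theta_prime / n) * (1.0 - p_a - p_a ** beta)


THETA_PRIME, BETA = 0.35, 0.10  # aggregate estimates from experiment 1

# (1) S approaches p_A as n grows (here p_A = 1).
for n in (2, 6, 12, 20, 200):
    print(n, judged_probability(n, 0, THETA_PRIME, BETA))

# (2) Trade-off between p and n: (8:1) versus (2:0).
print(judged_probability(8, 1, THETA_PRIME, BETA) >
      judged_probability(2, 0, THETA_PRIME, BETA))

# (3) Complementary judgments need not sum to 1 when n is small.
print(judged_probability(2, 1, THETA_PRIME, BETA) +
      judged_probability(1, 2, THETA_PRIME, BETA))
```

With these parameter values the adjustment at p_A = 1 is simply -θ'/n, which makes the 1/n decay of the adjustment easy to see.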

Experiment  1 

Subjects.  Thirty-two  subjects  were  recruited  through  an  ad  in  the 
University  newspaper  which  offered  $5  an  hour  for  participation  in  an 
experiment  on  judgment.  The  median  age  of  the  subjects  was  24,  their 
educational  level  was  high  (mean  of  4.4  years  of  formal  post-high  school 
education),  and  there  were  16  males  and  16  females. 

Stimuli. The stimuli consisted of a set of scenarios that involved a
hit-and-run accident seen by varying numbers of witnesses. Moreover, of the
n witnesses to the accident, f claimed that it was a green car while c
claimed it was a blue car. A typical scenario was phrased as follows:

    An automobile accident occurred at a street corner in downtown
    Chicago. The car that caused the accident did not stop
    but sped away from the scene. Of the n witnesses to the
    accident, f reported that the color of the offending car
    was green, whereas c reported it was blue. On the basis of
    this evidence, how likely is it that the car was green?

Each scenario was printed on a separate page and contained a 0-100 point
rating scale that was used by the subject to judge how likely it was that the
accident was caused by a car of a particular color. Each stimulus contained
the same basic story but varied in the total number of witnesses (n), the
number saying it was a green (f) or a blue car (c), and whether one was to
judge the likelihood that the majority or minority position was true. In order
to sample a wide range of values of n and p, 40 combinations were chosen as
follows:

    p = 1,   n = 2, 6, 12, 20
    p = .89, n = 9, 18, 27
    p = .80, n = 5, 10, 15, 20, 25
    p = .75, n = 4
    p = .67, n = 3, 6, 9, 12, 15, 18, 24
    p = .60, n = 5, 10
    p = .50, n = 2, 8, 12, 20
    p = .40, n = 5, 10
    p = .33, n = 6, 9, 18
    p = .25, n = 4
    p = .20, n = 5, 10
    p = .11, n = 9, 18
    p = 0,   n = 2, 6, 12, 20

In addition, 8 stimuli were given twice to ascertain test-retest reliability.
Thus, the total number of stimuli was 48, and they were arranged in booklet
form.

Procedure 

When the subjects entered the laboratory, they were told that the
experiment involved making inferential judgments. Furthermore, it was stated
that if they did well in the experiment (without specifying what this meant),
it was likely that they would be called for further experiments. Given the
relatively high hourly wage, this was thought to provide some incentive to
take the task seriously. In order to avoid boredom and to reduce the
transparency of the fact that judgments of complementary events were sometimes
required, subjects were given 4 sets of 12 stimuli and, after completing each
set, they performed a different task. All stimuli were randomly ordered within
the four sets. Subjects could take as much time as they needed and they were
free to make as many (or as few) calculations as they wished. After completing
the task, all subjects filled out a questionnaire regarding various
demographic variables.


Estimating  the  Model 

To estimate the model from the experimental data, we need to rewrite
equation (8) and include a random error term to represent judgmental
inconsistency; therefore,

$$S(f:c) = p_A + \frac{\theta'}{n}\left(1 - p_A - p_A^{\beta}\right) + \varepsilon \qquad (10)$$

Equation (10) requires a non-linear estimation technique, which was developed
in the following way: let S(f:c) be the actual response of the subject
and Ŝ(f:c) the predicted response from the model. We wish to minimize some
loss function (we chose the mean absolute deviation, MAD) by finding values
of θ' and β such that,

$$\sum \left| S(f:c) - \hat{S}(f:c) \right| = \text{minimum}$$

This was done by setting up a grid of values of θ' and β and writing a
computer program to first compute the MAD for pairs of "coarse" values of θ'
and β. Since certain ranges of θ' and β can thus be excluded, the
program then considers "finer-grained" values until MAD is minimized. The
output from this analysis is a unique set of values for θ' and β that
minimizes the desired loss function.
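The coarse-to-fine search described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the search ranges, the number of grid steps, the number of refinement passes, and the names `predict`, `mad`, and `grid_search` are all hypothetical, and equation (8) is assumed to have the form S = p_A + (θ'/n)(1 - p_A - p_A^β).

```python
def predict(f, c, theta_prime, beta):
    """Model prediction for S(f:c) under the assumed form of equation (8)."""
    n = f + c
    p_a = f / n
    return p_a + (theta_prime / n) * (1.0 - p_a - p_a ** beta)


def mad(data, theta_prime, beta):
    """Mean absolute deviation between judgments and model predictions."""
    errors = [abs(s - predict(f, c, theta_prime, beta)) for f, c, s in data]
    return sum(errors) / len(errors)


def grid_search(data, theta_max=1.0, beta_max=2.0, steps=10, passes=4):
    """Coarse-to-fine grid search for (theta', beta) minimizing MAD.

    Each pass evaluates a (2*steps + 1)^2 grid centered on the best pair
    found so far, then shrinks the grid width by a factor of `steps`.
    """
    theta_c, beta_c = theta_max / 2, beta_max / 2  # grid center
    theta_w, beta_w = theta_max / 2, beta_max / 2  # grid half-width
    best = (mad(data, theta_c, beta_c), theta_c, beta_c)
    for _ in range(passes):
        for i in range(-steps, steps + 1):
            for j in range(-steps, steps + 1):
                # Clamp at zero so that p_a ** beta is always defined.
                t = max(0.0, theta_c + theta_w * i / steps)
                b = max(0.0, beta_c + beta_w * j / steps)
                score = mad(data, t, b)
                if score < best[0]:
                    best = (score, t, b)
        _, theta_c, beta_c = best
        theta_w, beta_w = theta_w / steps, beta_w / steps
    return best  # (MAD, theta', beta)


# A few (f, c, mean judgment) triples taken from Table 1.
sample = [(2, 0, .85), (6, 0, .92), (12, 0, .96), (20, 0, .95), (3, 1, .63),
          (4, 1, .80), (1, 1, .45), (1, 4, .21), (0, 2, .16), (0, 6, .07)]
print(grid_search(sample))
print(mad(sample, 0.0, 1.0))  # baseline: theta' = 0 reduces the model to p_A
```

Because MAD is not differentiable everywhere, a grid search is a simple and robust choice here. The final comparison against θ' = 0 mirrors the comparison with the baseline p_A-model.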

Since the sampling distributions of θ' and β are not known, testing
the statistical significance of the model's fit to the data is problematic.
We therefore adopted the strategy of comparing the accuracy of Ŝ(f:c) with
that of a model based solely on p_A. Moreover, since p_A is the anchor of
the assumed process, any difference between the accuracy of p_A and Ŝ(f:c)
can be attributed to the adjustment process, and thus to θ' and β. We
emphasize that this procedure is biased against finding differences between
p_A and Ŝ(f:c) for two reasons: (a) the model predicts that S(f:c) → p_A
as n increases. Thus, since we have included some large values of n to
test this prediction, if S(f:c) ≈ p_A, this counts against, rather than for,
the model; (b) the model further predicts that S(f:c) = p_A at the cross-over
point, p_c, and will be close to p_A in the region of p_c. Again, if this
occurs, it counts against the model. We take this highly conservative
approach to guard against attributing random error in the data to an
adjustment process.

Results 

Before discussing the major results, recall that for each subject, 8
stimuli were given twice so that test-retest reliability could be assessed.
This was done in two ways: (1) the correlation between judgments of the same
stimuli, within each subject (N = 8), was computed. The mean of these
correlations was .93, with 26 of the 32 coefficients greater than .90; (2) each
subject was considered a replication with 8 responses and the correlation
between judgments for identical stimuli, over subjects (N = 256 = 32 subjects
× 8 responses), was .91. Clearly, the reliability of the judgments was high,
regardless of the computational method.

For a general impression of how well the model fits the data, we first
consider an aggregate analysis (individual differences will be considered in
detail below). For each of the 48 stimuli, the judgments from the 32 subjects
were averaged to form the mean judged likelihood, S̄(f:c). This was then used
as the dependent variable to be fit by the model. The parameter estimates
obtained from the estimation program were θ' = .35 and β = .10 (implying an
estimated cross-over point of p_c = .16). In addition, the mean absolute
deviation of model and data was .020, which is significantly lower than that
of the baseline p_A-model (MAD = .041; p < .001 using a Wilcoxon
matched-pairs signed-ranks test).
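The cross-over point implied by a given β can be recovered numerically. The sketch below assumes that the adjustment term in equation (8) is proportional to 1 - p - p^β, so that p_c is the value of p at which this term vanishes. The function name `crossover_point` and the use of bisection are illustrative choices, not the authors' procedure.

```python
def crossover_point(beta, tol=1e-10):
    """Solve 1 - p - p**beta = 0 for p in (0, 1) by bisection.

    Requires beta > 0, for which the left-hand side is strictly
    decreasing in p, positive at p = 0, and negative at p = 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1.0 - mid - mid ** beta > 0:
            lo = mid   # adjustment still upward: root lies above mid
        else:
            hi = mid   # adjustment downward (or zero): root at or below mid
    return (lo + hi) / 2


print(crossover_point(0.10))  # beta estimated from the aggregate data
print(crossover_point(1.0))   # beta = 1 puts the cross-over at one half
```

Below p_c the model adjusts judgments upward from the anchor, and above p_c it adjusts them downward, so a small β confines upward adjustments to low values of p_A.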

To see whether the implications of the model hold, consider Table 1,
which shows S̄(f:c) and Ŝ(f:c) for the 48 stimuli. First, does
S(f:c) → p_A as n increases? The data strongly support this when p_A = 1,
.67, .60, .50, .40, and 0. At the values of .75 and .25, n was not varied,
although the large adjustments do suggest that the expected effect would
occur. However, the effect of n is less clear at p = .89, .80, and .33
since there is little initial adjustment at small n. Taken together, these
results suggest moderate support for the hypothesis. Second, do p and
n trade off in affecting judged likelihood? The evidence here is quite
convincing: e.g., note that S̄(8:1) = .88 > S̄(2:0) = .85, S̄(10:5) = .65 >
S̄(3:1) = .63, and S̄(1:4) = .21 > S̄(1:3) = .20. Of particular interest is the
result that S̄(0:2) = .16 > S̄(1:8) = .12. This means that a small sample
containing no reports in favor of a hypothesis can be judged as stronger
evidence for that hypothesis than a larger sample with mixed support. Third,
an important implication of the model concerns the relation between θ, β, and
the additivity of complementary probabilities. Recall from equation (7) that
when θ > 0 and β < 1, sub-additivity is predicted for 0 < p_A < 1. To test
this prediction, consider Table 2, which shows both S̄(f:c) + S̄(c:f) and
Ŝ(f:c) + Ŝ(c:f). Note that there is substantial sub-additivity and the model
does a reasonably good job of capturing it. In judging the performance of the
model in this regard, it is useful to remember that we have gone beyond the
qualitative prediction that sub-additivity will be present in the data, to
specifying both the amount of the effect and the conditions under which it
will not be present. Given these goals, we view the results as supporting our
model. Moreover, note that the baseline p_A-model would always predict perfect
additivity and thus does not describe these data well.

TABLE 1

Fit of the Model for Aggregate Data

     n      p_A     S̄(f:c)   Ŝ(f:c)

     2      1        .85      .83
     6      1        .92      .94
    12      1        .96      .97
    20      1        .95      .98
     9      .89      .88      .86
    18      .89      .87      .87
   (18)    (.89)    (.85)     .87
    27      .89      .87      .88
     5      .80      .80      .75
   ( 5)    (.80)    (.73)     .75
    10      .80      .79      .77
    15      .80      .81      .78
    20      .80      .80      .79
    25      .80      .82      .79
   (25)    (.80)    (.80)     .79
     4      .75      .63      .69
     3      .67      .61      .60
   ( 3)    (.67)    (.59)     .60
     6      .67      .62      .63
   ( 6)    (.67)    (.63)     .63
     9      .67      .61      .65
    12      .67      .64      .65
    15      .67      .65      .66
    18      .67      .63      .66
    24      .67      .66      .66
     5      .60      .53      .56
    10      .60      .58      .58
     2      .50      .45      .42
     8      .50      .44      .48
   ( 8)    (.50)    (.47)     .48
    12      .50      .47      .49
    20      .50      .47      .49
     5      .40      .36      .38
    10      .40      .39      .39
     6      .33      .31      .32
   ( 6)    (.33)    (.29)     .32
     9      .33      .27      .32
    18      .33      .29      .33
     4      .25      .20      .24
     5      .20      .21      .20
    10      .20      .19      .20
   (10)    (.20)    (.18)     .20
     9      .11      .12      .11
    18      .11      .13      .11
     2      0        .16      .17
     6      0        .07      .06
    12      0        .06      .03
    20      0        .04      .02

Numbers in parentheses are for the repeat judgments.

Individual  Analyses 

Since each subject rated all stimuli, we can fit the model for each
person. These results are shown in Table 3. The table indicates substantial
individual differences in the parameter values and the degree to which the
model fits the data (as indicated by the MADs). When compared with the
aggregate analysis, the individual models contain considerably more noise
(recall that the MAD for the aggregate data is .020). Furthermore, in
comparing each subject's model against the baseline p_A-model, 14 of the 32
subjects showed no significant adjustment process, as specified by our model,
while 18 did. The qualifier "as specified by our model" is emphasized because
no subject, even those for whom the estimated θ' = 0, used a strict
p_A-strategy (i.e., S(f:c) = p_A for all p_A and n). Instead, some used p_A
most of the time but occasionally adjusted for n at p_A = 0 and 1, while
others had no clearly discernible strategy. This helps to explain why the MAD
for subjects with an estimated θ' ≤ .10 is not close to zero, as would be
expected if they simply used p_A for making their judgments. Indeed, subject 6
(estimated θ' = .02) had the highest MAD of the 32 subjects. Thus, there seem
to be idiosyncratic ways of making probability judgments that are not captured
by equation (8).

Insert Table 3 about here

The above should not detract from the fact that a majority of subjects
did show a significant adjustment in accord with the theory. We illustrate
this by the results of five subjects, each representing a different
combination of θ' and β parameters. This is shown in Table 4. Subject
26 illustrates the use of a highly consistent strategy in which downward
adjustments are made over almost the entire range of p. Subject 18 also has
a consistent strategy involving adjustments, but the estimated p_c = .50,
implying that adjustments will be down when p_A > .5, up when p_A < .5, and
absent at p_A = .5. The data conform quite closely to this pattern. Subject 15
has a somewhat less consistent strategy of making small upward adjustments
over most of the range of p (estimated p_c = .84). Again, the data are
generally consistent with this interpretation. Subject 3 is included for
contrast since, as can be seen, there was almost total reliance on p_A (as
would be predicted by the parameter values and low MAD). Subject 32 is shown
to illustrate the most extreme and least consistent adjustment process (which
was generally downward). As is evident from the data, this subject had
difficulty in "controlling" the adjustment process (cf. Hammond & Summers,
1972, on "cognitive control"). This lack of consistency manifested itself in
widely different adjustments for the same stimuli as well as illogical
judgments. An example of the latter was that evidence of (0:2) was evaluated
as stronger than (2:0) (i.e., .40 vs. .30). The lack of consistency and large
amount of adjusting that characterize subject 32 suggested that there might be
a positive relation between the size of θ' and MAD, over subjects. When we
investigated this, the correlation was r = .46 (p < .001). Thus, there seems
to be a connection between the amount of adjustment and the ability to execute
it consistently.

Insert Table 4 about here

Our final results concern the additivity/non-additivity of complementary
probabilities for individual subjects. This is illustrated using the subjects
discussed above, whose data are displayed in Table 5. The important thing
to note is that subject 26 is consistently sub-additive (and this is predicted
quite well by the model); subject 18 is generally additive, as implied by
the estimated p_c = .50; subject 15 is super-additive, but not consistently
so; subject 3 is additive; subject 32 is both highly sub-additive and
inconsistent. From our perspective, these results strengthen our
interpretation of the θ and β parameters, as well as the general form of the
model.

Insert Table 5 about here

A possible criticism of the above experiment is that although we
investigated the responses of 32 individual subjects in depth, we only obtained
responses to a single scenario. In other words, are our results simply a
function of the content of the specific scenario investigated? Therefore, we
ran another 32 subjects using four different content scenarios but with the
same numerical values as in the scenario involving the automobile accident.
These scenarios involved: (1) A taste test where people had to identify a
soft drink (Coke vs. Pepsi); (2) A bank robbery where witnesses said the
robbers spoke to each other in a foreign language (German vs. Italian); (3) An
experiment where 6-year-old children had to identify words flashed on a screen
(ROT vs. BED); and (4) Experts investigating the cause of a fire (arson vs.
short-circuit). Eight subjects were assigned at random to each scenario.
Since the results from these four scenarios parallel those of the
automobile-accident scenario in terms of model fits (albeit with different
parameter values), they are not presented here.

TABLE 5

Additivity/Non-additivity of Complementary Probabilities

Experiment 2

We had two goals in conducting experiment 2. First, we wished to test
systematically for the effects of source credibility and signal dissimilarity
on the parameters of the model. In accordance with our theory, θ' should
decrease as source credibility and signal dissimilarity increase. Second, we
wished to investigate the importance of individual differences in the way
people cope with the ambiguity inherent in our judgment task.

METHOD 

Design.  Two  levels  (high/low)  of  source  credibility  and  dissimilarity  of 
signals  were  crossed  in  a  2  x  2  factorial  design.  In  addition,  four  different 
content  scenarios  were  constructed  that  varied  on  all  four  experimental 
combinations  (resulting  in  16  different  stories).  Subjects  were  asked  to 
judge  21  stimuli  that  varied  in  p  and  n  (see  below)  for  each  of  the  four 
content-distinct  scenarios.  Thus,  each  subject  initially  made  84  probability 
judgments.  However,  to  reduce  boredom  in  the  task,  subjects  made  judgments  in 
all  four  scenarios,  with  each  scenario  representing  one  of  the  four  experi¬ 
mental  conditions.  For  example,  subject  1  received  scenario  A  in  the 
high/high  condition,  scenario  B  in  the  high/low  condition,  and  so  on.  A  four- 
person  latin-square  was  set  up  so  that  every  scenario  appeared  an  equal  number 
of  times  in  each  experimental  condition.  Finally,  since  subjects  made  judg¬ 
ments  in  one  scenario  under  the  high/high  condition,  the  same  scenario  was 
also  given  in  the  low/low  condition  (and  the  order  was  counter-balanced).  In 
this  way,  we  were  able  to  examine  each  subject's  judgments  holding  the  content 
of  the  scenario  constant.  This  part  of  the  experiment  required  21  additional 
judgments,  making  the  total  number  of  responses  for  each  subject  equal  to  105. 
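The counterbalancing scheme can be made concrete with a short sketch. The source does not say which Latin square was used, so the cyclic square below is an assumption, and the names `SCENARIOS`, `CONDITIONS`, and `latin_square_assignment` are hypothetical. The sketch also reproduces the count of 105 judgments per subject.

```python
SCENARIOS = ["A", "B", "C", "D"]
# Each condition is (source credibility, signal dissimilarity).
CONDITIONS = [("high", "high"), ("high", "low"),
              ("low", "high"), ("low", "low")]


def latin_square_assignment(scenarios, conditions):
    """Return one dict per subject mapping condition -> scenario.

    Uses a cyclic Latin square (an assumption): subject i receives
    scenario (i + j) mod k in condition j, so every scenario appears
    exactly once in each condition across the k subjects.
    """
    k = len(scenarios)
    return [{conditions[j]: scenarios[(i + j) % k] for j in range(k)}
            for i in range(k)]


square = latin_square_assignment(SCENARIOS, CONDITIONS)
for subject, row in enumerate(square, start=1):
    print(subject, row)

STIMULI_PER_SCENARIO = 21
# Four scenario-condition blocks, plus one extra block in which the
# high/high scenario is repeated under the low/low condition.
total_judgments = len(CONDITIONS) * STIMULI_PER_SCENARIO + STIMULI_PER_SCENARIO
print(total_judgments)
```

Repeating this four-person square eight times would cover the 32 subjects, with each scenario appearing equally often in each condition.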

Stimuli. The four content scenarios used involved the automobile
accident from experiment 1, the word-recognition task described above, and two
new stories. These latter scenarios involved determining the name of a play
from an excerpt, and the diagnosis of a medical condition. Four versions of
each scenario were constructed to reflect different levels of credibility and
dissimilarity (e.g., in the word-recognition task, we had 15- vs. 6-year-olds
and BED vs. ROT as opposed to BED vs. BID). Within each scenario, subjects
were given 21 stimuli that reflected the amount of evidence for each
hypothesis. The values of the stimuli were different from those used in
experiment 1 in that smaller values of n were used in order to provide more
sensitive tests of the model. The stimuli used were: for p = 0, 1, n = 1, 2,
6; for p = .125, .875, n = 8; for p = .2, .8, n = 5; for p = .25, .75, n = 4;
for p = .33, .67, n = 6, 9; for p = .67, n = 3; for p = .4, .6, n = 5; for p =
.5, n = 2, 8.

Subjects and Procedures. Thirty-two subjects participated in this
experiment (comprising 8 four-person Latin squares). Subjects were paid $5 per
hour and the task took about one hour to complete. The tasks were presented
in booklets and after each series of 21 judgments, subjects were either given
a break or another task. At the end of the experiment, a manipulation check
was performed on the credibility and dissimilarity induction. Specifically,
each subject was asked to rate (using a 0-100 scale) the credibility of the
source and the confusability of the signals in all four scenarios. Since each
scenario had high and low levels of each factor, the subjects rated
credibility and dissimilarity under both conditions. Therefore, subjects made
4 judgments on each of the 4 scenarios.

Results

Before presenting the main results, we note that the manipulation check
showed that subjects did, on average, rate the "high" credibility versions of
the same scenarios as more credible than the low (80 vs. 47), and the high
dissimilarity signals as less confusable than the low (30 vs. 62).




(1) General fit of the model: For each subject in each experimental
condition, the model was fit to yield estimates of θ' and β (this resulted
in 160 models: 32 subjects x 5 models). The fit of the individual models was
comparable to that of experiment 1 (median MAD = .042 over all conditions).

(2) Manipulation of θ': The appropriate analysis-of-variance (2 x 2
x Latin-square) was performed using θ' as the dependent variable and the
results showed a significant main effect for "credibility" (p < .001), no
main effect for "dissimilarity," and a three-way interaction of scenario x
credibility x dissimilarity (p < .02). The results for the main effect are
shown in Table 6. The table shows that θ' does increase as the credibility
of the source decreases, thereby confirming our prediction. However, there
was no effect for dissimilarity, contrary to our prediction. The three-way
interaction showed that in two scenarios, the dissimilarity of the signals
had a large effect on θ' when credibility was low, while in the other two
scenarios, dissimilarity had a large effect when credibility was high.
However, it is not clear why this occurred and we do not consider it
further.

In addition to the above analysis, recall that each subject also received
the same scenario in the high/high and low/low conditions. A comparison of
the means of the estimated θ''s in these two conditions also showed a
significant difference in the hypothesized direction; i.e., mean θ' = .17 in
the high/high condition, mean θ' = .29 in the low/low (p < .004 by a paired
t-test). Thus, with the exception of an effect for the dissimilarity of the
signals, our hypotheses concerning θ' are supported by the experimental data.

(3) Individual differences: We now consider the following: (a) can
subjects be characterized as having a general strategy, as measured by the
consistency of their θ' and β values, in different scenarios?; (b) is the
amount of one's adjustment, as measured by θ', systematically related to the
consistency of executing one's strategy?; (c) can individual perceptions of
the credibility of the source and the dissimilarity of the signals account for
variance in θ' and β within each of the experimental conditions?

(a) Recall that for each subject, four different scenarios were given and
a model fit to the data in each. Therefore, each subject can be characterized
by four θ''s, β's, and MAD's. To determine if the parameter values were
more alike within a subject than between subjects (this is measured by the
intra-class correlation), a one-way repeated analysis-of-variance was
performed (32 x 4) for θ', p_c, and MAD (Winer, 1963, chap. 3). The results
showed that for θ', r = .73 (p < .001); for p_c, r = .68 (p < .001); and
for MAD, r = .86 (p < .001). These results are particularly striking when one
recalls that the four scenarios varied over the four experimental conditions.
However, in spite of these differences, the results show strong and stable
individual strategies in the amount that is adjusted (θ'), the direction
of the adjustments (p_c or β), and the consistency of executing one's
strategy (MAD).
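
The intra-class correlation reported in (a) can be illustrated with a short
sketch. The text cites Winer (1963, chap. 3) but does not state which
estimator was computed, so the common one-way form below, (MS_between -
MS_within) / (MS_between + (k - 1) MS_within), is an assumption, and the
function name and the tiny data table are hypothetical.

```python
from statistics import mean


def intraclass_correlation(scores):
    """One-way ANOVA intra-class correlation (assumed estimator).

    scores: one row per subject, each row holding k parameter estimates
    (e.g., one estimate per scenario). Values near 1 mean that estimates
    are far more alike within a subject than between subjects.
    """
    n = len(scores)            # number of subjects
    k = len(scores[0])         # estimates per subject
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]

    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(scores, row_means) for v in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: 2 subjects x 2 scenarios, not taken from the experiment.
example = [[1, 2], [3, 4]]
icc = intraclass_correlation(example)
```

In the experiment the table would be 32 x 4 (subjects by scenarios), with one
such analysis for each of the three measures.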

(b) In experiment 1, we found a significant positive correlation between
θ' and MAD. The same positive relation was found here in three of the four
scenarios (r = .67, .48, .40, .10). Thus, our interpretation of θ as
reflecting a cognitive simulation process is strengthened by the generality of
this finding.

(c) Since each subject made independent judgments of the credibility and
confusability of the experimental stimuli, we were also able to investigate how
these judgments related to θ' within experimental conditions. To do so, we
re-analyzed our data with a regression model where θ' was the dependent
variable, and the individual ratings of credibility and confusability, together
with dummy variables representing the different scenarios, were the independent
variables. More precisely, there is a regression equation of this type for each
of the four experimental conditions. However, these four equations can be
estimated more efficiently as a single model using Zellner's (1962) procedure
for "seemingly unrelated" regressions. The multiple R estimated by this
procedure was .44 (with an adjusted R of .35). Of the independent variables,
there was no effect for either scenarios or confusability. However, all four
coefficients for credibility in the different experimental conditions were
significant (p < .02) and of the hypothesized sign (i.e., a negative relation
between θ' and ratings of credibility). We interpret these results as
strengthening the conclusions drawn from the more standard ANOVA of our study;
that is, θ' is not only affected by different levels of credibility across
all subjects, it also covaries significantly with individual perceptions of
credibility within each of these levels.

Experiment  3 

The purpose of this experiment was to answer the following question: Can
individuals' choices between gambles be predicted from knowledge of their θ'
and β parameters obtained from a separate inference task? To examine this,
subjects were first asked to make judgments as in experiments 1-2 and both
θ' and β were estimated as before. The subjects were then asked to choose
(or express indifference) between 9 pairs of gambles involving the outcome from
an urn with known probability versus the occurrence of an event on the basis of
unreliable reports. If θ' and β do capture aspects of ambiguity that affect
choice, knowledge of these parameters should allow one to predict individual
choices in addition to inferences.


Subjects. Twenty subjects, recruited from the University of Chicago
community, participated in this study; they were paid $5/hour.

Stimuli. For the inference task, two different scenarios were used:
the automobile-accident story, and the taste-test story (Pepsi vs. Coke) for
which we had also previously collected data (see end of experiment 1). These
were chosen because the θ' and β values were quite different in the two
cases. In both scenarios, subjects received 40 combinations of p and n
that were identical to those used in experiment 1. The stimuli for the choice
task involved one of the following: (a) In the automobile-accident task,
subjects were faced with choosing between betting that a ball drawn from an
urn with known probability was green versus betting that the car that caused
the accident was green based on witnesses' reports of the car color; (b) For
those in the taste-test scenario, the choice was similarly between betting
that the outcome from an urn was a certain color versus betting that the
drink was Pepsi-Cola. In both scenarios, subjects were told to imagine that
their payoff for being correct would be $10. Thus, the payoffs for the urn
gamble and the bet involving the report of some event were equal. Within
scenarios, each subject saw 9 pairs of gambles that varied in the proportion
of colored balls in the urn and the proportion of reports favoring the
particular hypothesis. These proportions were always the same in the two
bets. The exact values of p used in the 9 pairs were: 1, .875, .75, .625,
.50, .375, .25, .125, and 0. The number of balls in the urn and the number of
reports were held constant at 8.

Procedure. The 20 subjects were randomly assigned to one of the two
scenarios. The procedure for the inference task was identical to the previous
experiments. After subjects finished the inference task, they were presented
with the appropriate choice task. The nature of the two gambles was
explained, and subjects were then asked to choose, or indicate indifference,
between the gambles. If they were not indifferent, they were also asked to
indicate their strength of preference on a 4-point scale (from "little" to
"great deal"). After doing this for one value of p, they turned the page
and made another choice (and strength of preference rating, if appropriate) at
the next level of p. This continued until all 9 pairs had been considered.
Therefore, for each subject, there were 9 choices between an unambiguous bet
from an urn with known p, versus an ambiguous bet that an event occurred, on
the basis of the proportion of favorable reports from an unreliable source.

Results. Since each subject first participated in the inference task,
we briefly consider these results before discussing the choice data. As
expected, there were marked differences in the θ' and β parameters in the
two scenarios. The medians for θ' and p_c (implied by β) were .13 and
.11, respectively, in the automobile-accident scenario. For the taste-test
story, the median θ' was 1.35 and median p_c = .45. Thus, the taste-test
scenario induced much adjustment, with a cross-over point near .50, while the
automobile-accident story induced less adjustment but a lower cross-over
point.

To compare each subject's choices with predictions from the inference
model, the following procedure was used: any combination of θ' and p_c
implies when and where S(p_A) is greater than, less than, or equal to p_A (see
equation (8)). Thus, for each subject, when p_A > S(p_A), we predicted the
urn would be chosen over the bet based on unreliable reports; when S(p_A) >
p_A, the opposite prediction was made; when S(p_A) = p_A, we predicted
indifference between the two gambles. Note that when θ' = 0, we always
predicted indifference between the gambles since S(p_A) = p_A for all p_A.
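
The decision rule just described can be sketched as follows. Equation (8) is
not reproduced in this section, so the function S is passed in as an argument;
the two-parameter form built by make_s is purely illustrative and is an
assumption, as are all names and parameter values in the sketch.

```python
def predict_choice(p_a, s, tol=1e-9):
    """Predict 'urn', 'report', or 'indifferent' for one pair of gambles.

    p_a: stated proportion (same for the urn and for the reports).
    s:   callable returning the adjusted estimate S(p_A).
    """
    s_val = s(p_a)
    if abs(s_val - p_a) <= tol:
        return "indifferent"
    # Urn preferred when the ambiguous bet is judged less likely to pay off.
    return "urn" if p_a > s_val else "report"


def make_s(theta, beta):
    # ASSUMPTION: illustrative form S(p) = p + theta * (1 - p - p**beta);
    # it is not claimed to be equation (8) of the source.
    return lambda p: p + theta * (1.0 - p - p ** beta)


# The nine levels of p used in the choice task.
P_LEVELS = [1, .875, .75, .625, .50, .375, .25, .125, 0]

# Hypothetical subject: theta = .3, beta = 1 (cross-over at p = .5).
predictions = [predict_choice(p, make_s(0.3, 1.0)) for p in P_LEVELS]

# With theta = 0 the adjustment vanishes and indifference is always predicted.
no_adjustment = [predict_choice(p, make_s(0.0, 1.0)) for p in P_LEVELS]
```

For the hypothetical subject, the rule predicts the urn above the cross-over
point, the report-based bet below it, and indifference at the cross-over.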

In Table 7, we show the θ' and p_c values for each subject (grouped by
scenario), and the number of correct choice predictions by subject.

To evaluate how well the choices were predicted from knowledge of θ'
and p_c, we used a random baseline for comparison; i.e., for each of the 9
choices made by a subject, there were three possible outcomes: urn, report,
or indifference. Since the probability of randomly predicting the correct
response is 1/3, we computed the probability of getting at least r hits in 9
trials on the basis of chance (using the binomial distribution). This
probability is shown in the last column of Table 7. For example, subject 1 was
correctly predicted in 8 of the 9 choices; the probability of getting at least
this many hits by chance is .001. Thus, we rejected the hypothesis that our
predictions for this subject were no better than chance. Using this method
for all subjects, it can be seen that 5 of the 10 subjects in the
automobile-accident scenario, and 4 of 10 in the taste-test, are well
predicted using a type I error level of .05. If this error level were
increased to .15, a majority of subjects (12/20) would be accurately predicted
from their inference parameters. In any event, at the aggregate level (over
subjects and scenarios), there were 103 hits out of 179 predictions (one
response was missing). The probability of getting at least this many hits by
chance is minuscule.
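
The chance baseline above is an upper-tail binomial probability, which can be
checked with a few lines of code. This is a minimal sketch; the function name
is hypothetical, and only the figures quoted in the text (9 trials, chance
rate 1/3, 8 hits, 103 hits in 179) are taken from the source.

```python
from math import comb


def prob_at_least(r, n=9, p=1 / 3):
    """P(X >= r) for X ~ Binomial(n, p): chance of at least r hits."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(r, n + 1))


# Subject 1: 8 correct predictions out of 9 choices.
subject_1 = prob_at_least(8)

# Aggregate: 103 hits out of 179 predictions.
aggregate = prob_at_least(103, n=179)
```

For 8 hits in 9 trials the exact value is 19/19683, which rounds to the .001
reported in the text.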

Second, consider the results concerning the strength of preference
ratings. Recall that in addition to choosing between gambles, subjects were
asked to rate their strength of preference on a 4-point scale. These ratings
supplement  our  analysis  of  the  choice  data  in  the  following  way:  in  each 
scenario,  the  number  of  prediction  errors  was  38.  However,  in  the  taste-test. 


between S(p_A) and p_A should be larger in the taste-test than in the
accident story. Furthermore, the larger the differences, the stronger one's
preferences should be since they are further away from indifference (where
p_A = S(p_A)). We tested this by comparing the mean strength-of-preference
ratings in the two stories across the nine levels of p. These results are
shown in Table 8. First, note that the means for the taste-test are larger
than the automobile-accident at every level of p. Second, the pattern of
means is consistent with the general form of the model in that preferences are
strongest at p = 1, decrease as p approaches p_c, and then increase again
at p = 0. Therefore, the strength-of-preference data are consistent with
both the difference in the sizes of θ' for the two scenarios, as well as the
general form of the model.

Insert Table 8 about here

As the astute reader may have noticed, our theory does not necessarily
imply exact equivalence between choice and inference tasks since these could
differ with respect to the β parameter. In particular, while payoffs are
explicit in the choice task (i.e., a gain of $10), there are no explicit
payoffs in the inference task. Thus, one might expect a systematic bias
between β as estimated in the inference task, and β as implied by
subjects' choices. Specifically, as stated after first presenting our model,
if the effect of ambiguity is to induce caution rather than riskiness, then
the prospect of a gain would focus attention more on smaller rather than
larger values of p(Gain) such that β_choice < β_inference. (Conversely, the
prospect of a loss would imply more attention being paid to greater rather
than smaller values of p(Loss) such that β_choice > β_inference.)
Consequently, one would expect ambiguity avoidance over a wider range of p in
tasks involving choice as opposed to inference. Indeed, some of the errors in
predicting subjects' choices can be attributed to precisely this source of
systematic bias. Consider the data for the automobile-accident scenario in
Table 7. Eight of the 10 subjects had low p_c values in the inference task
and thus could be thought of as having already conceptualized the task in
terms of "gains." On the other hand, for the two subjects with high p_c's in
the inference task (numbers 6 and 10), all 7 prediction errors out of 18
choices were in the same direction, namely subjects' β's estimated in the
inference task indicated larger p_c's than were revealed by their choices.
The same bias was also found in the taste-test scenario. That is, consider
again only those subjects with high p_c's (numbers 13 through 18). For 4 of
these subjects, all 15 out of 15 prediction errors are consistent with the
β_choice < β_inference bias; two prediction errors of one subject (number
16) are in the opposite direction, and the 8 prediction errors of subject 14
are equally distributed in both directions. To summarize, we conclude that
whereas individuals' parameters in an inference task can be used to predict
choices, many errors of prediction are in accord with a systematic bias in the
β parameter that is consistent with our theory.

Experiment 4

Having manipulated the θ parameter in experiment 2, the purpose of
experiment 4 was to investigate the effects of manipulating β. This was done
by allocating subjects to different roles (sellers and buyers) in an insurance
context. The dependent variable of interest involved statements of maximum
buying prices and minimum selling prices. The data were collected as part of
a larger investigation by Hogarth and Kunreuther (1984) on the effects of
ambiguity in insurance decision making.

The assumption underlying the experimental manipulation is that a person
who assumes a risk is likely to pay more attention to larger values of
p(Loss) than someone who transfers the risk. Experimental evidence consistent
with this assertion has been documented by Hershey, Kunreuther, and Schoemaker
(1982) and Thaler (1980). In our framework, it implies that β_seller >
β_buyer. Given this assumption, approximate ambiguity functions for buyers and
sellers of insurance can be sketched as in Figure 4. Note that when
buyers/sellers are in a non-ambiguous situation, S(p_A) = p_A and all
responses are on the diagonal.

If one further assumes that buying and selling prices are monotonically
related to S(p_A), Figure 4 suggests the following predictions: (1) When
buyers and sellers are in a non-ambiguous situation, S(p_A) = p_A, and the
seller's price should equal the buyer's; (2) when buyers and sellers are
equally ambiguous (i.e., their θ's are equal), the seller's price should
exceed the buyer's over the whole range of p_A. Note that this arises because
the seller always weights imaginary values of p(Loss) larger than the
initial estimate more than the buyer. (3) Consider a seller who has no
ambiguity about the probability of a loss, but a buyer who does. In Figure 4,
this is shown by comparing the buyer-ambiguous function with the diagonal
(seller-unambiguous). Note that the buyer's function is above the diagonal
for p_A < p_c. This means that the buyer will perceive the probability of loss
as higher than the seller, and should be willing to pay more than the seller
would ask. However, when p_A > p_c, the buyer will perceive the loss
probability as lower than the seller, and offer less than the seller would ask.
This implication of the model provides a particularly stringent test for our
theory. Experiment 4 was designed to test the above three predictions.

Figure 4. Approximate ambiguity functions for
buyers and sellers of insurance.
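
The three predictions can be illustrated numerically. The sketch below is not
the fitted model: it assumes an illustrative two-parameter form, S(p) = p +
θ(1 - p - p^β), since the model equation is not reproduced in this section,
and the values of θ and β are hypothetical, chosen only so that β_seller >
β_buyer.

```python
def s(p, theta, beta):
    # ASSUMPTION: illustrative ambiguity function, not taken from the source.
    return p + theta * (1.0 - p - p ** beta)


# The four loss-probability levels used in the design.
P_LEVELS = [.01, .35, .65, .90]

THETA = 0.3          # hypothetical amount of ambiguity, equal for both roles
BETA_BUYER = 1.0     # hypothetical; cross-over p_c = .5 under this form
BETA_SELLER = 2.0    # hypothetical; larger than the buyer's value

# (1) No ambiguity (theta = 0): both roles stay on the diagonal.
unambiguous = [(s(p, 0.0, BETA_BUYER), s(p, 0.0, BETA_SELLER))
               for p in P_LEVELS]

# (2) Equal ambiguity: seller's S(p) minus buyer's S(p) at each level.
seller_minus_buyer = [s(p, THETA, BETA_SELLER) - s(p, THETA, BETA_BUYER)
                      for p in P_LEVELS]

# (3) Ambiguous buyer vs. unambiguous seller (the diagonal, S(p) = p).
buyer_minus_diagonal = [s(p, THETA, BETA_BUYER) - p for p in P_LEVELS]
```

Under these assumed values, the seller's function lies above the buyer's at
every level, and the ambiguous buyer's function is above the diagonal at .01
and .35 but below it at .65 and .90.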


METHOD 


Design. Prices for insurance contingent on ambiguous and non-ambiguous
probabilities were investigated across four different probability levels (.01,
.35, .65, and .90). Each subject was assigned the role of buyer or seller of
a contract concerning a potential $100,000 loss and responded to both
ambiguous and non-ambiguous versions of the stimulus at one probability level.
Thus, the design of the experiment involved three factors, two of which were
between subjects (i.e., role of buyer or seller, and probability level) and
one within subjects (i.e., ambiguous vs. non-ambiguous probabilities).

Stimuli. The scenario used in the stimulus material involved the owner
of a small business (net assets of $110,000) who was seeking to insure against
a $100,000 loss that could result from claims concerning a defective product.
Subjects assigned the role of buyers were told to imagine that they were the
owner of the business. Subjects assigned the role of sellers were asked to
imagine that they headed a department in a large insurance company and were
authorized to set premiums for the level of risk involved. Ambiguity was
manipulated by factors involving how well the manufacturing process was
understood, whether the reliabilities of machines used in the process were
known, and the extent to which manufacturing records were well kept. In both
ambiguous and non-ambiguous cases a specific probability level was stated
(e.g., .01); however, a comment was also added as to whether one could "feel
confident" (non-ambiguous case) or "experience considerable uncertainty"
(ambiguous case) concerning the estimate. As far as possible, the same
wording was used in both the buyer and seller versions so that perceptions of
ambiguity would be uniform in the two cases.

Subjects and procedures. Subjects were 111 MBA students at the
University of Chicago who responded to questionnaires distributed in a course
on decision making. To avoid prior influence, the experiment took place
during the beginning of classes. Subjects were asked to respond to the
questionnaire in anonymous fashion and promised group-level feedback at a
later class session (which they subsequently received). It is important to
note that subjects had prior training in business, economics, and statistics,
and the insurance context was familiar to them. Eight different forms of the
stimulus materials, corresponding to the 2 (roles) x 4 (probability levels),
were shuffled and distributed in the classrooms, thereby ensuring random
allocation of subjects to conditions. After reading each stimulus, subjects
were asked to state maximum buying prices (for buyers) or minimum selling
prices (for sellers).

Results. Table 9 reports medians for all experimental conditions as well
as the differences between the sellers and buyers for the ambiguous and
non-ambiguous cases respectively. We report medians since several distributions
within conditions are quite skewed, and variances also differ between cells at
the same probability levels, often significantly. The pattern of results in
Table 9 supports our three predictions. First, in comparing buyers and
sellers in the non-ambiguous case, note that the median prices are quite
similar over the four probability levels. Second, when buyers and sellers are
both ambiguous, observe that the selling price is considerably larger than the
buying price at every level of p. This result strongly confirms the notion
that β_seller > β_buyer when considering ambiguous loss probabilities. Third,
consider the ambiguous buyer and the non-ambiguous seller. As expected,
when p is small (.01), the ambiguous buyer is willing to pay more ($1,500)
than the non-ambiguous seller asks ($1,000). However, as the probability of
loss increases, the two prices converge (at p = .35), and then diverge, with
the buyer's price being less than the seller's (at p = .65, .90). Indeed, at
higher probabilities, the buyer is willing to spend considerably less than the
seller wants to charge. Therefore, although the buyers seem to be ambiguity
avoiding at low probabilities, they paradoxically appear to be ambiguity
seeking for high loss probabilities. Both results are predicted by our
model. In Hogarth and Kunreuther (1984), the results of several other related
experiments are reported using different scenarios, research designs,
subjects, and response modes. The results of these experiments are consistent
with those reported here, thus attesting to the stability of the phenomena.

DISCUSSION 

We now discuss the implications of our theory and results with respect to
the following issues: (1) the importance of ambiguity in assessing
uncertainty; (2) the use of cognitive strategies in probabilistic judgments
under ambiguity; (3) the role of ambiguity in risky choice; and, (4)
extensions of the model to multiple sources and time periods.

Ambiguity and the Assessment of Uncertainty

The concept of ambiguity highlights the distinction between one's lack of
knowledge of the process that generates outcomes and the uncertainty of
outcomes conditional on some model of the process. The fact that there are at
least two sources of uncertainty in most situations leads to the irony that
one needs a well-defined model to give precise estimates of how much one
doesn't know. Indeed, the usefulness of formulating well-defined stochastic
processes is in eliminating ambiguity so that outcome uncertainty can be
quantified. Thus, when coins are "fair" or random drawings are taken from
urns with known p, there is no second-order uncertainty. Furthermore, the
conditional nature of uncertainty is implicitly recognized in various attempts
to quantify and improve inferential judgments. For example, consider how
uncertainty is defined in the "lens model" (Hammond, et al., 1964). In this
case, the uncertainty in the environment is measured as the residual variance
not accounted for by a well-formulated ecological model. Thus, unexplained
variance or uncertainty is conditional on the model of how particular cues
combine to form the criterion of interest. Now consider the work of Nisbett
and colleagues on trying to improve probabilistic judgments through training
(Nisbett, et al., 1983; Jepson, et al., 1983). They argue that training and
experience can allow one to see the underlying structure of real-world
problems so that the appropriate model can be used for making better
judgments. Thus, the focus of their training is on making various statistical
principles (e.g., regression-to-the-mean, law of large numbers, use of base
rates, etc.) more obvious in everyday inferences.

While the conditional nature of uncertainty has been implicitly
recognized, ambiguity results from its explicit recognition; i.e., by
realizing that the "model" is itself subject to uncertainty. Indeed, one
could argue that the cost of urn models, coin-flipping analogies, and the
like, is that they obscure the fact that most real world generating processes
are not precisely known. Furthermore, even if a process is well-defined at
one point in time, the parameter(s) of the process can change over time,
resulting in ambiguity as well as uncertainty. For example, imagine that you
have been asked to evaluate the research output of a younger colleague being
considered for promotion. Your colleague has produced 11 papers; of these the
first 9 (in chronological order) represent competent, albeit unexciting
scholarly work. On the other hand, the latter 2 papers are quite different;
they are innovative and suggest a creativity and depth of thought absent from
the earlier work. What should you do? As someone who is aware of regression
fallacies, you might consider the two outstanding papers as outliers from a
stable generating process and thus predict regression-to-the-mean.
Alternatively, you might consider the outstanding papers as "extreme"
responses that signal a change in the generating process; i.e., a new and
higher mean. If this were the case, the same general regression model would
predict future papers of high quality (regression to a higher mean). If one
asks what is the nature of the signaling in this case, it is obvious that the
chronological order of the papers is crucial. Indeed, imagine that the
outstanding papers were the first two that were written; or consider that they
were the second and sixth. Each of these cases suggests a different
underlying model and perhaps a different prediction. In any event, the
uncertainty associated with any prediction is complicated by the ambiguity
regarding the appropriate mean of the regression process.

Cognitive Strategies in Inferences Under Ambiguity

We have assumed that people use an anchoring-and-adjustment strategy in
making inferences under ambiguity. However, whereas the term
"anchoring-and-adjustment" is quite general and could encompass a wide range
of models (cf. Lopes, 1981; Einhorn & Hogarth, 1984), we have been quite
specific as to the nature of this process in our tasks. Of greatest interest
in this regard is the idea that adjustments are based on a mental simulation
in which "what might be," or, "what might have been," is combined with "what
is" (the anchor). The rationale for this comes from the fact that the
evaluation of evidence often involves an implicit comparison process (similar
to the perception of figure against ground). Thus, when evaluating the
strength of evidence for a particular hypothesis, the evidence that might have
been can serve as a convenient contrast case for comparison. Furthermore,
since ambiguity implies that multiple models could have produced the observed
results, it seems natural to consider that different results could have
occurred on the basis of different underlying processes.

The support for the hypothesized anchoring-and-adjustment strategy
comes from several sources. First, recall that in our model, the largest
adjustments to the anchor occur when evidence is meager. Moreover, as n
increases, S(f:c) asymptotes at p_A. The results of experiments 1 and 2
support this prediction. Thus, the weight of evidence (to use Keynes' term)
for  "what  is,"  dominates  "what  might  have  been"  as  the  absolute  amount  of 
evidence  increases.  Furthermore,  the  effect  of  increasing  n  is  to  reduce 
the  amount  of  non-additivity  of  complementary  strengths.  Since  most  of  our 
subjects  were  sub-additive,  our  model  provides  a  psychological  link  to 
concerns  stressed  by  others  regarding  the  appropriateness  of  additivity  when 
evidence  is  meager  (Shafer,  1976;  Cohen,  1977).  In  particular,  Cohen  (1977, 
chap.  3)  points  out  that  when  one  considers  an  incomplete  system,  the  lower 
benchmark on provability is not necessarily disprovability, but nonprovability.
For example, if one has meager circumstantial evidence such that the
probability of the truth of a particular theory is .2, does this imply that
the theory is false with p = .8? Rather, one might say that the theory is
not proven (in a probabilistic sense) as opposed to saying that there is a .80
chance that it is wrong. Furthermore, the idea that the complement of
statements can lead to "not-proved" rather than "disproved" seems to be
deeply embedded in the Anglo-American legal system. Indeed, in Scottish law,
defendants are found either guilty, not guilty, or "not proven." The last
category is reserved for those cases where the evidence is too meager to allow
for  a  judgment  of  guilt  or  innocence. 

Second,  the  fact  that  non-additivity  results  from  a  shift  in  the 
direction of the adjustment process is consistent with other "order effects"
due to the use of anchoring-and-adjustment strategies. For example, in a
traditional  Bayesian  revision  task,  Lopes  (1981)  found  that  a  change  in  the 
order  in  which  sample  information  was  presented  affected  overall  judgments  by 
changing  the  anchor.  Thus,  consider  having  to  judge  whether  samples  come  from 
an  urn  containing  predominantly  red  or  blue  balls  (70/30  in  both  cases).  You 
first draw a sample of 8 that shows (5R:3B). Thereafter, you draw another
sample  of  8  with  the  result  (7R:1B).  After  each  sample,  you  are  asked  how 
likely  it  is  that  you  have  drawn  from  the  predominantly  red  urn.  When  the 
sample  evidence  is  in  the  order  given  here,  people  seem  to  anchor  on  the  first 
sample  (5:3)  and  then  adjust  up  for  the  second  (stronger)  sample.  However, 
when  the  order  of  the  samples  is  reversed,  people  anchor  on  (7:1)  and  adjust 
down  for  the  weaker,  second  sample.  This  effect  cannot  be  accounted  for  by 
assuming  that  people  are  using  a  Bayesian  procedure  (which  treats  the  two 
situations as equal), but it does follow from an anchoring-and-adjustment
process  in  which  the  anchor  is  weighted  more  heavily  than  the  adjustment. 
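
The order-invariance of the Bayesian benchmark in this urn example can be checked directly. The sketch below is illustrative only: it assumes equal prior probabilities for the two urns and independent draws (neither is stated above), and the function name posterior_red is hypothetical.

```python
from math import isclose

def posterior_red(samples, p_major=0.7, prior_red=0.5):
    """Posterior probability of the predominantly red urn after all samples.

    samples is a sequence of (red, blue) counts. Equal priors are an
    assumption made here for illustration.
    """
    odds = prior_red / (1.0 - prior_red)
    for red, blue in samples:
        # Likelihood ratio of one sample: the binomial coefficients cancel,
        # leaving (p / (1 - p)) ** (red - blue).
        odds *= (p_major / (1.0 - p_major)) ** (red - blue)
    return odds / (1.0 + odds)

weak, strong = (5, 3), (7, 1)

after_weak = posterior_red([weak])        # judgment after a first sample of 5R:3B
after_strong = posterior_red([strong])    # judgment after a first sample of 7R:1B
forward = posterior_red([weak, strong])   # weak sample first
reverse = posterior_red([strong, weak])   # strong sample first

print(f"after 5R:3B only: {after_weak:.4f}")
print(f"after 7R:1B only: {after_strong:.4f}")
print(f"both, weak first: {forward:.4f}; both, strong first: {reverse:.4f}")
```

Under these assumptions the intermediate posteriors differ, but the final posterior is the same in either order, so any order effect in people's final judgments must come from the judgment process rather than from the evidence.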

Third, the results of experiment 2 provide important evidence regarding
the process assumed to underlie the model. In addition to the fact that the
experimental manipulation of source credibility affected θ as predicted,
two other results were found: a positive correlation between θ and MAD, and
the stability of individual differences in θ, β, and MAD across scenarios.
The first result bears directly on the nature of the adjustment process since
it suggests a "cost" of engaging in mental simulation; namely, a concomitant
lack of control over one's strategy (Hammond & Summers, 1972). The second
result  suggests  strong  personal  propensities  in  evaluating  evidence  that 
transcend  the  particular  content  of  scenarios.  While  it  is  too  early  to 
explicate  the  nature  of  these  individual  differences,  their  existence  lends 
support to the idea that the parameters of our model do capture important
aspects of the process that determines judgments under ambiguity.

While  our  model  accounts  for  the  rather  simple  inferences  we  have 
studied, it also relates to an important class of inferences that result from
"surprise." Consider Figure 5, which shows one's expectations for p as a
function of the credibility of the source and the dissimilarity of the signals.

Insert Figure 5 about here

First, note that when credibility and dissimilarity are high, one expects p to
be  very  high  or  low  (recall  our  earlier  example  of  cameras  taking  pictures  of  a 
bank  robber).  However,  imagine  that  one  camera  showed  the  bank  robber  to  be 
white, and the other showed him to be black. Such a result, where p = .5, would
be  surprising  given  the  credibility  of  cameras  and  the  dissimilarity  of  white  and 
black  robbers.  Indeed,  the  data  "are  not  good  enough,"  which  is  represented  by 
the  range  of  p  indicated  by  the  two-headed  arrow.  Second,  consider  the  low 
credibility-low dissimilarity situation; e.g., the taste-test scenario. Imagine
that you were told that of the 20 people in the Pepsi vs. Coke taste-test, all
correctly identified the drink as Pepsi. Such a result, where p = 1, would
be  surprising.  However,  this  type  of  surprise  is  one  where  the  "data  are  too 
good"  rather  than  not  good  enough.  Thus,  there  are  two  types  of  surprise  and 
both occur when ambiguity is low. Indeed, when ambiguity is high, expectations
are weak and surprise (which results from a violation of expectations)
is unlikely. This situation characterizes the off-diagonal cells in the
figure and accounts for our labeling these "little surprise."

Although  our  conceptual  scheme  makes  clear  when  surprise  is  likely  to 
occur,  it  cannot  handle  the  variety  of  possible  reactions  it  can  engender. 

For  example,  when  data  are  not  good  enough,  it  is  possible  to  reduce  the 
credibility  of  the  source  (e.g.,  the  cameras  were  broken),  synthesize  the 
hypotheses (there were two bank robbers, one white and the other black), or
otherwise make sense of the data by changing the story (e.g., there were two
bank robberies on successive days). On the other hand, when data are too
good,  inferences  of  fraud,  collusion,  and  the  like,  are  possible  (see,  e.g., 
Kamin, 1974, on Burt's twin data; Bishop, Fienberg, & Holland, 1975, on
Mendel's pea experiments). An interesting aspect of such inferences is that
the surface meaning of the data can suggest the opposite conclusion; e.g.,
consider someone who "protesteth too much," or a suspect who was "framed" for
a  crime.  Indeed,  this  is  implied  by  our  model.  Specifically,  consider  the 
case of totally unreliable data which imply θ = 1 (see equation (6)). In
this case,

$$ S(p_A) = 1 - p_A^{\beta} \tag{11} $$

Thus, as p_A increases, S(p_A) decreases. More generally, as θ
increases, it will reach a point, conditional on p_A and β, where the
evidence for a hypothesis will start to be counted against it.
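
The reversal point can be made concrete with a short numerical sketch. Equation (6) is not reproduced in this section; the code assumes it has the form S(p_A) = p_A + θ(1 - p_A - p_A^β), which is the form that reduces to equation (11) when θ = 1. The function names and parameter values are illustrative only.

```python
def strength(p_a, theta, beta):
    """Assumed form of equation (6): S(p_A) = p_A + theta * (1 - p_A - p_A**beta)."""
    return p_a + theta * (1.0 - p_a - p_a ** beta)

def slope(p_a, theta, beta):
    """Derivative dS/dp_A = 1 - theta * (1 + beta * p_A**(beta - 1)), for 0 < p_A < 1."""
    return 1.0 - theta * (1.0 + beta * p_a ** (beta - 1.0))

def reversal_theta(p_a, beta):
    """Value of theta at which dS/dp_A = 0 for the given p_A and beta.

    Above this value, a small increase in p_A lowers S(p_A), i.e. evidence
    for the hypothesis starts to count against it.
    """
    return 1.0 / (1.0 + beta * p_a ** (beta - 1.0))

beta = 0.5  # illustrative value, not an estimate from the experiments
for p_a in (0.2, 0.5, 0.8):
    t_star = reversal_theta(p_a, beta)
    print(f"p_A={p_a:.1f}  reversal theta={t_star:.3f}  "
          f"S at theta=1: {strength(p_a, 1.0, beta):.3f}")
```

Because β p_A^(β-1) is positive for 0 < p_A < 1, the reversal value of θ is always below 1, so the fully unreliable case θ = 1 lies in the region where S(p_A) decreases in p_A.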

Ambiguity  and  Risk 

Although the importance of ambiguity for understanding risk has been
evident since Ellsberg's original article, its omission from the voluminous
literature on risk is puzzling. One reason may be the reliance on the
explicit lottery, with stated payoffs and probabilities, for representing
risky choice. Indeed, as Lopes (1983) has noted,

The  simple,  static  lottery  or  gamble  is  as  indispensable  to 
research  on  risk  as  is  the  fruitfly  to  genetics.  The  reason 
is  obvious;  lotteries,  like  fruitflies,  provide  a  simplified 
laboratory  model  of  the  real  world,  one  that  displays  its 
essential  characteristics  while  allowing  for  the  manipulation 
and  control  of  important  experimental  variables.  (1983,  p.  137) 

It should be further noted that the explicit lottery has been of equal
importance to those interested in axiom systems and formal models of risk.

While explicit lotteries have been, and continue to be, useful for
studying risk, the ambiguities surrounding real world processes in domains
such as nuclear power, environmental safety, and the like, accentuate the
incomplete nature of such representations. Indeed, Ellsberg pointed out the
particular  importance  of  ambiguity  in  understanding  people's  reactions  to  new 
technologies (also see Edwards & von Winterfeldt, 1982, for a historical look
at  reactions  to  earlier  technological  innovations).  In  any  event,  the  neglect 
of  ambiguity  in  theories  of  risk  is  slowly  giving  way  to  interest  at  both  the 
formal-axiomatic level (e.g., Fishburn, 1983a, 1983b; Gärdenfors & Sahlin,
1982, 1983; Morris, 1983) as well as the psychological level (Lopes, 1983).

Our  model  of  inference  under  ambiguity  has  several  implications  for 
descriptive models of risky choice. First, since the β parameter can be
related  to  the  desirability  of  outcomes,  the  model  implies  a  form  of  utility  x 
probability  interaction.  Moreover,  experiment  4  provides  direct  evidence  for 
this  interaction.  However,  the  utility  x  probability  interaction  only  has  an 
effect in the presence of ambiguity, i.e., when θ > 0. Thus, whereas the
bilinear assumption may be appropriate for models that exclude the effects of
ambiguity (e.g., Kahneman & Tversky, 1979), it is not clear that this
assumption  can  be  maintained  when  ambiguity  prevails.  Second,  both  our  model 
and  data  show  that  the  net  effect  of  the  adjustment  process  (i.e.,  k)  varies 
in magnitude with the level of p_A. Thus, theories of inference that weight
probabilities according to some "reliability" factor (e.g., Gärdenfors &
Sahlin, 1982) need to consider this interaction explicitly to achieve
descriptive realism. Third, the model highlights the difficulty of inferring
underlying  attitudes  toward  risk  from  choices  made  in  ambiguous  circumstances. 
For  example,  a  person  buying  insurance  against  a  potential  loss  that  is 
contingent  on  a  small,  ambiguous  probability  might  appear  risk  averse; 
however, the same person could appear to be risk-seeking if the probability
were larger (cf. experiment 4). Viewed from the framework of expected utility
theory, such behavior would imply an inconsistent utility function. However,
this need not be the case since the apparent changes in risk attitude could
result from the effects of ambiguity. At the very least, our model provides a
way of analyzing the sources of such seemingly inconsistent behavior. As
Hogarth  and  Kunreuther  (1984)  point  out,  scholars  have  often  attempted  to 
resolve  anomalous  choice  patterns  by  considering  different  forms  of  utility 
functions.  On  the  other  hand,  transformations  of  probabilities  have  received 
far  less  formal  attention  (for  an  exception,  see  Karmarkar,  1978).  Finally, 
whereas our model does not explicate all aspects of ambiguous choice, it does
suggest  exciting  possibilities  for  further  work  in  this  area. 

Extensions to Multiple Sources and Time Periods

To  examine  inferences  under  ambiguity  in  depth,  we  have  restricted 
ourselves to how evidence from a single source is evaluated at one point in
time. However, consider the more realistic situation where decision makers
receive information from multiple source-types (including base rates) over
multiple time periods. The aggregation of information over source-types and
time can be conceptualized by an "evidence matrix" that has source-types for
rows and time periods for columns. Such a matrix is shown in Figure 6.

Insert Figure 6 about here

The entries in each cell of the matrix reflect the conflicting evidence received
from  a  source-type  in  that  period.  The  matrix  provides  a  simple  yet  powerful 
way  to  look  at  a  wide  variety  of  inference  problems.  In  particular,  by 
focusing  on  source-types  (rows)  or  time  periods  (columns),  one  can  look  at  the 
combining  of  information  either  longitudinally,  cross-sectionally,  or  both. 
Furthermore, the issues of reliability and ambiguity become quite complex here
since there can be differential source reliability, varying numbers of reports
per source, and the sources may not be "independent." While the challenge of
understanding how people incorporate such factors into their judgments is
formidable, the complexity of inferences in real world settings requires that
attention  be  paid  to  these  issues. 
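
One way to picture the evidence matrix is as a small data structure. The sketch below is a hypothetical encoding, not the authors' formalization: each cell is assumed to hold counts of reports for and against a hypothesis, the source-type labels and counts are invented for illustration, and simple pooling of counts is used only to show the row-wise and column-wise views.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    """Conflicting evidence from one source-type in one period (hypothetical encoding)."""
    favoring: int  # reports supporting the hypothesis
    against: int   # reports contradicting it

def pool(cells):
    """Pool counts over cells and return (p_A, n); p_A is None when n == 0.

    Pooling treats all reports as equally reliable and independent, which is
    an assumption of this sketch.
    """
    f = sum(c.favoring for c in cells)
    n = f + sum(c.against for c in cells)
    return (f / n if n else None, n)

# Rows are source-types, columns are time periods (invented example data).
matrix = {
    "eyewitness": [Cell(2, 1), Cell(3, 0)],
    "camera": [Cell(1, 1), Cell(0, 2)],
}

by_source = {name: pool(row) for name, row in matrix.items()}  # longitudinal view
by_period = [pool(col) for col in zip(*matrix.values())]       # cross-sectional view
overall = pool([c for row in matrix.values() for c in row])    # both at once

print(by_source)
print(by_period)
print(overall)
```

Pooling deliberately ignores differential source reliability and dependence among sources, which are the factors identified above as making the general problem difficult.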

CONCLUSION

In  considering  the  role  of  ambiguity  and  uncertainty  in  inferential 
judgments,  we  have  developed  a  quantitative  model  that  accounts  for  much 
existing  data  as  well  as  our  own  experimental  findings.  Furthermore,  we  have 
shown  how  this  model  relates  to  Keynes'  idea  of  the  weight  of  evidence,  the 
non-additivity  of  complementary  probabilities,  risky  choice,  and  current  work 
on  cognitive  heuristics.  Moreover,  since  inference  involves  "going  beyond  the 
information given" (Bruner, 1957), an important way to do this is to construct,
via imagination, "what might have been" or "what might be." Such
constructions, whether the result of a cognitive simulation process as
proposed here, or more elaborate processes (as in resolving surprise), pose an
interesting and important trade-off for the organism. On the one hand, there
are costs of investing in imagination: increased mental effort and the
discomfort that results from greater uncertainty. On the other hand,
considering the world as it isn't protects one from overconfidence
and its nonadaptive consequences. Thus, finding the appropriate
compromise between "what is" and "what might have been" (or "what might be")
is central to inferences under ambiguity and uncertainty.

= 0, 1 there is no ambiguity. Hence, the relation between p_A
and S(p_A) should be discontinuous. Indeed, the lack of ambiguity at the end
points  provides  a  rationale  for  the  discontinuity  of  the  decision-weight 
function  and  this  implies  the  "certainty  effect"  of  prospect  theory  (i.e.,  the 
value  of  sure  gambles  is  heightened  either  positively  or  negatively). 

2. A listing of the program is available from the authors.


APPENDIX  A 


This appendix considers the effects of different assumptions concerning
the weights given in imagination to values of p greater and smaller than
p_A. In equation (4), differential weighting is achieved by the β parameter;
i.e., k_g = θ(1 - p_A) and k_s = θ p_A^β. However, one could also consider linear
weighting schemes where the weights given to θ p_A and θ(1 - p_A) sum to one
(i.e., a weighted averaging process), or where the weights do not sum to
one. For the former, let

$$ k_g - k_s = \theta w (1 - p_A) - \theta (1 - w) p_A = \theta (w - p_A) \tag{A.1} $$

where 0 < w < 1 is the relative weight given to greater values.

Substituting (A.1) into equation (6), we obtain,

$$ S_1(p_A) = p_A + \theta (w - p_A) \tag{A.2} $$

where S_1(p_A) is used to denote alternative model 1. Note that in this
model, S_1(p_A) is regressive with respect to p_A. Although this model has
appealing features, it is easy to show that it does not capture some aspects
of our model and data. Specifically, it always predicts additivity of
judgments of complementary events, i.e.,

$$ S_1(p_A) + S_1(1 - p_A) = p_A + \theta (w - p_A) + (1 - p_A) + \theta [(1 - w) - (1 - p_A)] = 1 \tag{A.3} $$

However, non-additivity will occur if the weights accorded to θ(1 - p_A)
and θ p_A do not sum to one. A special case of this model, which we denote
S_2(p_A), and which is similar to the S(p_A) model used in the paper, is one
in which

$$ k_g - k_s = \theta (1 - p_A) - \theta m p_A \quad (m > 0) \tag{A.4} $$

This yields,

$$ S_2(p_A) = p_A + \theta [1 - p_A - m p_A] \tag{A.5} $$

such that the additivity conditions are,

$$ S_2(p_A) + S_2(1 - p_A) = p_A + \theta [1 - p_A - m p_A] + (1 - p_A) + \theta [p_A - m (1 - p_A)] = 1 + \theta (1 - m) \tag{A.6} $$

Thus, for m > 1, the model predicts sub-additivity; for m = 1,
additivity; and for m < 1, super-additivity. The difference between
S_2(p_A) and S(p_A) is that the former predicts a constant amount of
non-additivity irrespective of the value of p_A. In the S(p_A) model, the level
of p_A affects the amount of additivity. This is shown in equation (7),
which is reproduced here for convenience.

$$ S(p_A) + S(1 - p_A) = 1 + \theta [1 - p_A^{\beta} - (1 - p_A)^{\beta}] \tag{A.7} $$
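
The contrast among the three models can be verified numerically. The sketch below is a check under stated assumptions rather than the authors' program: it takes S(p_A) = p_A + θ(1 - p_A - p_A^β) as the form of equation (6), uses (A.2) and (A.5) for the alternative models, and picks arbitrary illustrative parameter values.

```python
from math import isclose

def s_main(p, theta, beta):
    """Assumed form of equation (6): S(p_A) = p_A + theta * (1 - p_A - p_A**beta)."""
    return p + theta * (1.0 - p - p ** beta)

def s1(p, theta, w):
    """Alternative model 1, equation (A.2)."""
    return p + theta * (w - p)

def s2(p, theta, m):
    """Alternative model 2, equation (A.5)."""
    return p + theta * (1.0 - p - m * p)

THETA, BETA, W, M = 0.2, 0.5, 0.4, 1.5  # illustrative values only

def complement_sums(p):
    """Return the complementary sums for models S_1, S_2, and S at a given p_A."""
    # In (A.3) the complementary event receives weight (1 - w).
    sum1 = s1(p, THETA, W) + s1(1.0 - p, THETA, 1.0 - W)
    sum2 = s2(p, THETA, M) + s2(1.0 - p, THETA, M)
    sum_main = s_main(p, THETA, BETA) + s_main(1.0 - p, THETA, BETA)
    return sum1, sum2, sum_main

for p in (0.1, 0.3, 0.5):
    sum1, sum2, sum_main = complement_sums(p)
    print(f"p_A={p:.1f}  S_1 sum={sum1:.4f}  S_2 sum={sum2:.4f}  S sum={sum_main:.4f}")
```

With these values the S_1 sums equal 1 and the S_2 sums equal 1 + θ(1 - m) = 0.9 at every p_A, whereas the S sums change with p_A, as equation (A.7) implies.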